Quantum Algebra and
Symmetry
Quantum Algebraic Topology, Quantum
Field Theories and Higher Dimensional
Algebra
PDF generated at: Sun, 12 Dec 2010 01:24:24 UTC
Contents
Articles
Copyright © 2010 by I.C. Baianu; v5, December 10, 2010
I.C. Baianu, Ph.D., M.Inst.P, Editor (with listed contributors)
Quantum Theories
Quantum mechanics
Schrödinger equation
Dirac equation
Klein-Gordon equation
Einstein-Maxwell-Dirac equations
Rigged Hilbert space
Quantum inverse scattering method
Quasi-Hopf algebra
Quasitriangular Hopf algebra
Ribbon Hopf algebra
Quasi-triangular Quasi-Hopf algebra
Grassmann algebra
Supergroup
Superalgebra
Supergravity
Quantum statistical mechanics
Quantum thermodynamics
Supertheory
Quantum Algebra
Quantum algebra
Lie algebra
Lie group
Hopf algebra
Quantum group
Affine quantum group
Affine Lie algebra
Quantum affine algebra
Operator algebra
Clifford algebra
Distributions
Hilbert space
Von Neumann algebra
C*-algebra
Kac-Moody algebra
Spectral theory
Quantum Field Theory, SUSY, Quantum Geometry and Quantum Algebraic Topology
Quantum electrodynamics
Quantum field theory
Scalar field theory
Yang-Mills theory
Yangian
Quantum spacetime
Quantum gauge theory
Standard Model
Topological quantum field theory
Quantum Chromodynamics
Quantum Geometry
Loop Quantum Gravity
Quantum Algebraic Topology
Commutativity
Noncommutative quantum field theory
Noncommutative standard model
Nonabelian Gauge Theory
List of quantum field theories
Noncommutative geometry
Quantum gravity
String Theories
Superstring Theories
Group Representations and Symmetry
Symmetry
Representations
Representations of the symmetric group
Representations of a finite group
Representations of finite groups of Lie type
Representations of Lie groups
Representations of Lie algebras
Representations of the Poincaré group
Representation theory of the Lorentz group
Double group
Noether's Theorem
Goldstone theorem
Stone-von Neumann theorem
Peter-Weyl theorem
Supersymmetry
Invariance mechanics
Wigner's theorem
Wigner's classification
Quantum hydrodynamics
Quantum magnetodynamics
Nonabelian group
Higher Dimensional Algebra
Algebraic Topology
Category Theory
Double Groupoid
Higher dimensional algebra
Higher category theory
Duality (mathematics)
Anabelian geometry
Noncommutative geometry
Quantum Logics and Quantum Computers
Multi-valued logic
Quantum information
Quantum logic
Quantum computer
Algebraic Logic
Boolean logic
Algebraic logic
Łukasiewicz logic
Intuitionistic logic
Mathematical logic
Heyting arithmetic
Symbolic logic
Metatheory
Metalogic
Quantum Biographies
Niels Bohr
Max Planck
Louis de Broglie
The Lord Rutherford of Nelson
Albert Einstein
Erwin Schrödinger
John von Neumann
John van Vleck
Paul Dirac
Werner Heisenberg
Albert Einstein
Max Born
Eugene Wigner
Alexandru Proca
Hideki Yukawa
Neville Mott
Paul W. Anderson
George Karreman
Alberte Pullman
Richard Feynman
Murray Gell-Mann
Stephen Weinberg
Anatole Abragam
Ștefan Procopiu
Ionel Solomon
References
Article Sources and Contributors
Image Sources, Licenses and Contributors
License
Quantum Theories
Quantum mechanics
Quantum mechanics, also known as quantum
physics or quantum theory, is a branch of
physics providing a mathematical description of
much of the dual particle-like and wave-like
behavior and interactions of energy and matter. It
departs from classical mechanics primarily at the
atomic and subatomic scales, the so-called
quantum realm. In advanced topics of quantum
mechanics, some of these behaviors are
macroscopic and only emerge at very low or very
high energies or temperatures. The name, coined
by Max Planck, derives from the observation that
some physical quantities can be changed only by
discrete amounts, or quanta, as multiples of the
Planck constant, rather than being capable of
varying continuously or by any arbitrary amount.
For example, the angular momentum, or more
generally the action, of an electron bound into an
atom or molecule is quantized. While an unbound
electron does not exhibit quantized energy levels,
an electron bound in an atomic orbital has
quantized values of angular momentum. In the
context of quantum mechanics, the wave-particle
duality of energy and matter and the uncertainty
principle provide a unified view of the behavior
of photons, electrons and other atomic-scale
objects.
Fig. 1: Probability densities corresponding to the wavefunctions of an
electron in a hydrogen atom possessing definite energy levels (increasing
from the top of the image to the bottom: n = 1, 2, 3, ...) and angular
momentum (increasing across from left to right: s, p, d, ...). Brighter areas
correspond to higher probability density in a position measurement.
Wavefunctions like these are directly comparable to Chladni's figures of
acoustic modes of vibration in classical physics and are indeed modes of
oscillation as well: they possess a sharp energy and thus a definite frequency.
The angular momentum and energy are quantized, and only take on discrete
values like those shown (as is the case for resonant frequencies in
acoustics).
The mathematical formulations of quantum mechanics are abstract. Similarly, the implications are often
non-intuitive in terms of classical physics. The centerpiece of the mathematical system is the wavefunction. The
wavefunction is a mathematical function providing information about the probability amplitude of position and
momentum of a particle. Mathematical manipulations of the wavefunction usually involve the bra-ket notation,
which requires an understanding of complex numbers and linear functionals. In many simple models the wavefunction treats the object as a
quantum harmonic oscillator, and the mathematics is akin to that of acoustic resonance. Many of the results of
quantum mechanics do not have models that are easily visualized in terms of classical mechanics; for instance, the
ground state in the quantum mechanical model is a non-zero energy state that is the lowest permitted energy state of
a system, rather than a more traditional system that is thought of as simply being at rest with zero kinetic energy.
Historically, the earliest versions of quantum mechanics were formulated in the first decade of the 20th century at
around the same time as the atomic theory and the corpuscular theory of light as updated by Einstein first came to be
widely accepted as scientific fact; these latter theories can be viewed as quantum theories of matter and
electromagnetic radiation. Quantum theory was significantly reformulated in the mid-1920s away from the old
quantum theory towards the quantum mechanics formulated by Werner Heisenberg, Max Born, Wolfgang Pauli and
their associates, accompanied by the acceptance of the Copenhagen interpretation of Niels Bohr. By 1930, quantum
mechanics had been further unified and formalized by the work of Paul Dirac and John von Neumann, with a greater
emphasis placed on measurement in quantum mechanics, the statistical nature of our knowledge of reality and
philosophical speculation about the role of the observer. Quantum mechanics has since branched out into almost
every aspect of 20th century physics and other disciplines such as quantum chemistry, quantum electronics, quantum
optics and quantum information science. Much 19th century physics has been re-evaluated as the classical limit of
quantum mechanics, and its more advanced developments in terms of quantum field theory, string theory, and
speculative quantum gravity theories.
History
The history of quantum mechanics dates back to the 1838 discovery of cathode rays by Michael Faraday. This was
followed by the 1859 statement of the black body radiation problem by Gustav Kirchhoff, the 1877 suggestion by
Ludwig Boltzmann that the energy states of a physical system can be discrete, and the 1900 quantum hypothesis of
Max Planck.[1] Planck's hypothesis that energy is radiated and absorbed in discrete "quanta", or "energy elements",
enabled the correct derivation of the observed patterns of black body radiation. According to Planck, each energy
element $E$ is proportional to its frequency $\nu$:
$E = h\nu$
where h is Planck's action constant. Planck cautiously insisted that this was simply an aspect of the processes of
absorption and emission of radiation and had nothing to do with the physical reality of the radiation itself.
However, in 1905 Albert Einstein interpreted Planck's quantum hypothesis realistically and used it to explain the
photoelectric effect, in which shining light on certain materials can eject electrons from the material. Einstein
postulated that light itself consists of individual quanta of energy, later called photons.
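As a rough numerical illustration of Planck's relation, the sketch below evaluates E = hν for a given frequency or wavelength. The function names and the 500 nm sample wavelength are assumptions chosen for illustration and do not come from the original text; the constants are the exact SI values.

```python
# Illustrative sketch: photon energy from Planck's relation E = h * nu.
# Function names and the sample wavelength are hypothetical choices.

PLANCK_H = 6.62607015e-34            # Planck constant in J*s (exact SI value)
SPEED_OF_LIGHT = 299_792_458.0       # speed of light in m/s (exact SI value)
ELEMENTARY_CHARGE = 1.602176634e-19  # in C (exact SI value); 1 eV is this many joules


def photon_energy_joules(frequency_hz: float) -> float:
    """Energy of one quantum of radiation at the given frequency."""
    return PLANCK_H * frequency_hz


def photon_energy_ev_from_wavelength(wavelength_m: float) -> float:
    """Energy in electronvolts, using nu = c / wavelength."""
    frequency_hz = SPEED_OF_LIGHT / wavelength_m
    return photon_energy_joules(frequency_hz) / ELEMENTARY_CHARGE


if __name__ == "__main__":
    # Visible (green) light at 500 nm carries a few electronvolts per photon.
    print(photon_energy_ev_from_wavelength(500e-9))
```

Doubling the frequency doubles the energy of each quantum, which is the proportionality Planck postulated.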
The foundations of quantum mechanics were established during the first half of the twentieth century by Niels Bohr,
Werner Heisenberg, Max Planck, Louis de Broglie, Albert Einstein, Erwin Schrödinger, Max Born, John von
Neumann, Paul Dirac, Wolfgang Pauli, David Hilbert, and others. In the mid-1920s, developments in quantum
mechanics quickly led to its becoming the standard formulation for atomic physics. In the summer of 1925, Bohr and
Heisenberg published results that closed the "Old Quantum Theory". In recognition of their particle-like behavior,
light quanta came to be called photons (1926). Einstein's simple postulate gave rise to a flurry of debate,
theorizing and testing. Thus the entire field of quantum physics emerged, leading to its wider acceptance at the Fifth
Solvay Conference in 1927.
The other exemplar that led to quantum mechanics was the study of electromagnetic waves such as light. When it
was found in 1900 by Max Planck that the energy of waves could be described as consisting of small packets or
quanta, Albert Einstein further developed this idea to show that an electromagnetic wave such as light could be
described by a particle called the photon with a discrete energy dependent on its frequency. This led to a theory of
unity between subatomic particles and electromagnetic waves called wave-particle duality in which particles and
waves were neither one nor the other, but had certain properties of both. While quantum mechanics describes the
world of the very small, it also is needed to explain certain macroscopic quantum systems such as superconductors
and superfluids.
The word quantum derives from Latin meaning "how great" or "how much".[4] In quantum mechanics, it refers to a
discrete unit that quantum theory assigns to certain physical quantities, such as the energy of an atom at rest (see
Figure 1). The discovery that particles are discrete packets of energy with wave-like properties led to the branch of
physics that deals with atomic and subatomic systems which is today called quantum mechanics. It is the underlying
mathematical framework of many fields of physics and chemistry, including condensed matter physics, solid-state
physics, atomic physics, molecular physics, computational physics, computational chemistry, quantum chemistry,
particle physics, nuclear chemistry, and nuclear physics. Some fundamental aspects of the theory are still actively
studied. Quantum mechanics is essential to understand the behavior of systems at atomic length scales and smaller.
For example, if classical mechanics governed the workings of an atom, electrons would rapidly travel towards and
collide with the nucleus, making stable atoms impossible. However, in the natural world the electrons normally
remain in an uncertain, non-deterministic, "smeared" orbital (described by a wave function) around or through
the nucleus, defying classical electromagnetism. Quantum mechanics was initially developed to provide a better
explanation of the atom, especially the spectra of light emitted by different atomic species. The quantum theory of
the atom was developed as an explanation for the electron's staying in its orbital, which could not be explained by
Newton's laws of motion and by Maxwell's laws of classical electromagnetism. Broadly speaking, quantum
mechanics incorporates four classes of phenomena for which classical physics cannot account:
• the quantization (discretization) of certain physical quantities
• wave-particle duality
• the uncertainty principle
• quantum entanglement
Mathematical formulations
In the mathematically rigorous formulation of quantum mechanics developed by Paul Dirac and John von
Neumann, the possible states of a quantum mechanical system are represented by unit vectors (called "state
vectors"). Formally, these reside in a complex separable Hilbert space (variously called the "state space" or the
"associated Hilbert space" of the system) well defined up to a complex number of norm 1 (the phase factor). In other
words, the possible states are points in the projectivization of a Hilbert space, usually called the complex projective
space. The exact nature of this Hilbert space is dependent on the system; for example, the state space for position and
momentum states is the space of square-integrable functions, while the state space for the spin of a single proton is
just the product of two complex planes. Each observable is represented by a maximally Hermitian (precisely: by a
self-adjoint) linear operator acting on the state space. Each eigenstate of an observable corresponds to an eigenvector
of the operator, and the associated eigenvalue corresponds to the value of the observable in that eigenstate. If the
operator's spectrum is discrete, the observable can only attain those discrete eigenvalues.
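The statement that observables are self-adjoint operators with real, possibly discrete, eigenvalues can be checked by hand for the smallest nontrivial state space, the two-dimensional spin space mentioned above. The following is a minimal sketch under that assumption; the helper name and the choice of the Pauli matrix σ_y as the sample observable are illustrative and not taken from the original text.

```python
# Minimal sketch: eigenvalues of a 2x2 Hermitian (self-adjoint) matrix.
# The helper name and the sample observable are hypothetical choices.
import math


def hermitian_2x2_eigenvalues(matrix):
    """Return the two real eigenvalues (ascending) of [[a, b], [conj(b), d]]."""
    (a, b), (c, d) = matrix
    a, b, c, d = complex(a), complex(b), complex(c), complex(d)
    # Self-adjointness: real diagonal, off-diagonal entries complex conjugates.
    if a.imag != 0 or d.imag != 0 or c != b.conjugate():
        raise ValueError("matrix is not Hermitian")
    mean = (a.real + d.real) / 2
    radius = math.hypot((a.real - d.real) / 2, abs(b))
    return (mean - radius, mean + radius)


# Pauli matrix sigma_y: a spin observable with complex entries.
SIGMA_Y = [[0, -1j], [1j, 0]]

if __name__ == "__main__":
    print(hermitian_2x2_eigenvalues(SIGMA_Y))
```

Even though the matrix entries are complex, both eigenvalues are real, and a measurement of this observable can only return one of those two discrete values.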
In the formalism of quantum mechanics, the state of a system at a given time is described by a complex wave
function, also referred to as state vector in a complex vector space.[10] This abstract mathematical object allows for
the calculation of probabilities of outcomes of concrete experiments. For example, it allows one to compute the
probability of finding an electron in a particular region around the nucleus at a particular time. Contrary to classical
mechanics, one can never make simultaneous predictions of conjugate variables, such as position and momentum,
with arbitrary accuracy. For instance, electrons may be considered to be located somewhere within a region of space, but with
their exact positions being unknown. Contours of constant probability, often referred to as "clouds", may be drawn
around the nucleus of an atom to conceptualize where the electron is most likely to be located.
Heisenberg's uncertainty principle quantifies the inability to precisely locate the particle given its conjugate
momentum.
As the result of a measurement, the wave function containing the probability information for a system collapses from
a given initial state to a particular eigenstate of the observable. The possible results of a measurement are the
eigenvalues of the operator representing the observable — which explains the choice of Hermitian operators, for
which all the eigenvalues are real. We can find the probability distribution of an observable in a given state by
computing the spectral decomposition of the corresponding operator. Heisenberg's uncertainty principle is
represented by the statement that the operators corresponding to certain observables do not commute.
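The two claims in this paragraph, that outcome probabilities follow from expanding the state in the eigenvectors of the observable and that incompatible observables do not commute, can be sketched for a single spin. The example below is an illustration only: the function names, the choice of the Pauli matrices σ_x and σ_z, and the spin-up starting state are assumptions, not part of the original text.

```python
# Illustrative sketch: Born-rule probabilities and a non-vanishing commutator
# for a spin-1/2 system. All names and the chosen state are hypothetical.
import math


def inner(u, v):
    """Inner product <u|v>, conjugating the first argument."""
    return sum(complex(x).conjugate() * y for x, y in zip(u, v))


def born_probabilities(state, eigenvectors):
    """Probability of each outcome: |<eigenvector|state>|^2."""
    return [abs(inner(vec, state)) ** 2 for vec in eigenvectors]


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def commutator(a, b):
    """[a, b] = ab - ba for 2x2 matrices."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]


SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Z = [[1, 0], [0, -1]]
R = 1 / math.sqrt(2)
SIGMA_X_EIGENVECTORS = [[R, R], [R, -R]]  # eigenvalues +1 and -1 of sigma_x
SPIN_UP_Z = [1, 0]                        # eigenstate of sigma_z with value +1

if __name__ == "__main__":
    print(born_probabilities(SPIN_UP_Z, SIGMA_X_EIGENVECTORS))
    print(commutator(SIGMA_X, SIGMA_Z))
```

A state with a definite value of σ_z gives both σ_x outcomes with equal probability, and the commutator of the two operators is not the zero matrix, which is the algebraic form of their incompatibility.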
The probabilistic nature of quantum mechanics thus stems from the act of measurement. This is one of the most
difficult aspects of quantum systems to understand. It was the central topic in the famous Bohr-Einstein debates, in
which the two scientists attempted to clarify these fundamental principles by way of thought experiments. In the
decades after the formulation of quantum mechanics, the question of what constitutes a "measurement" has been
extensively studied. Interpretations of quantum mechanics have been formulated to do away with the concept of
"wavefunction collapse"; see, for example, the relative state interpretation. The basic idea is that when a quantum
system interacts with a measuring apparatus, their respective wavefunctions become entangled, so that the original
quantum system ceases to exist as an independent entity. For details, see the article on measurement in quantum
mechanics. Generally, quantum mechanics does not assign definite values to observables. Instead, it makes
predictions using probability distributions; that is, the probability of obtaining possible outcomes from measuring an
observable.[13] Often these results are skewed by many causes, such as dense probability clouds or quantum state
nuclear attraction. Naturally, these probabilities will depend on the quantum state at the "instant" of the
measurement. Hence, uncertainty is involved in the value. There are, however, certain states that are associated with
a definite value of a particular observable. These are known as eigenstates of the observable ("eigen" can be
translated from German as inherent or as a characteristic).
In the everyday world, it is natural and intuitive to think of everything (every observable) as being in an eigenstate.
Everything appears to have a definite position, a definite momentum, a definite energy, and a definite time of
occurrence. However, quantum mechanics does not pinpoint the exact values of a particle for its position and
momentum (since they are conjugate pairs) or its energy and time (since they too are conjugate pairs); rather, it only
provides a range of probabilities of where that particle might be given its momentum and momentum probability.
Therefore, it is helpful to use different words to describe states having uncertain values and states having definite
values (eigenstate). Usually, a system will not be in an eigenstate of the observable we are interested in. However, if
one measures the observable, the wavefunction will instantaneously be an eigenstate (or generalized eigenstate) of
that observable. This process is known as wavefunction collapse, a debatable process.[17] It involves expanding the
system under study to include the measurement device. If one knows the corresponding wave function at the instant
before the measurement, one will be able to compute the probability of collapsing into each of the possible
eigenstates. For example, a free particle will usually have a wavefunction that is a wave
packet centered around some mean position $x_0$, neither an eigenstate of position nor of momentum. When one
measures the position of the particle, it is impossible to predict with certainty the result.[12] It is probable, but not
certain, that it will be near $x_0$, where the amplitude of the wave function is large. After the measurement is
performed, having obtained some result x, the wave function collapses into a position eigenstate centered at x.[18]
The time evolution of a quantum state is described by the Schrödinger equation, in which the Hamiltonian, the
operator corresponding to the total energy of the system, generates time evolution. The time evolution of wave
functions is deterministic in the sense that, given a wavefunction at an initial time, it makes a definite prediction of
what the wavefunction will be at any later time.[19]
During a measurement, on the other hand, the change of the wavefunction into another one is not deterministic, but
rather unpredictable, i.e., random. Wave functions can change
as time progresses. An equation known as the Schrödinger equation describes how wave functions change in time, a
role similar to Newton's second law in classical mechanics. The Schrödinger equation, applied to the aforementioned
example of the free particle, predicts that the center of a wave packet will move through space at a constant velocity,
like a classical particle with no forces acting on it. However, the wave packet will also spread out as time progresses,
which means that the position becomes more uncertain. This also has the effect of turning position eigenstates
(which can be thought of as infinitely sharp wave packets) into broadened wave packets that are no longer position
eigenstates.
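The spreading of a free wave packet can be made quantitative in one special case. The sketch below assumes a Gaussian packet with initial position uncertainty σ0, for which the standard textbook result is σ(t) = σ0·sqrt(1 + (ħt / (2mσ0²))²); this formula, the function name, and the electron example are assumptions introduced here and are not stated in the original text.

```python
# Illustrative sketch: width of a free Gaussian wave packet over time,
# sigma(t) = sigma0 * sqrt(1 + (hbar * t / (2 * m * sigma0**2))**2).
# The Gaussian initial shape and all names are assumptions for illustration.
import math

HBAR = 1.054571817e-34          # reduced Planck constant, J*s
ELECTRON_MASS = 9.1093837015e-31  # kg


def gaussian_packet_width(t: float, sigma0: float, mass: float) -> float:
    """Position uncertainty at time t of a free Gaussian packet."""
    spread = HBAR * t / (2.0 * mass * sigma0 ** 2)
    return sigma0 * math.sqrt(1.0 + spread ** 2)


if __name__ == "__main__":
    sigma0 = 1e-10  # initial width of about one atomic diameter, in metres
    for t in (0.0, 1e-16, 1e-15):
        print(t, gaussian_packet_width(t, sigma0, ELECTRON_MASS))
```

The narrower the initial packet, the faster it broadens, which is why a position eigenstate (the limit of zero width) stops being one immediately.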
Some wave functions produce probability distributions that are constant, or independent of time: in a
stationary state of constant energy, time drops out of the absolute square of the wave function. Many systems that are
treated dynamically in classical mechanics are described by such "static" wave functions. For example, a single
electron in an unexcited atom is pictured classically as a particle moving in a circular trajectory around the atomic
nucleus, whereas in quantum mechanics it is described by a static, spherically symmetric wavefunction surrounding
the nucleus (Fig. 1). (Note that only the lowest angular momentum states, labeled s, are spherically symmetric).
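The way time drops out of the probability density of a stationary state can be written in one line. In the sketch below the symbols are assumptions introduced for illustration: $\varphi(x)$ is the spatial part of the wave function, $E$ its energy and $\hbar$ the reduced Planck constant.

```latex
\psi(x,t) = \varphi(x)\, e^{-iEt/\hbar}
\quad\Longrightarrow\quad
|\psi(x,t)|^2
  = \varphi^*(x)\, e^{+iEt/\hbar}\, \varphi(x)\, e^{-iEt/\hbar}
  = |\varphi(x)|^2 .
```

The time dependence sits entirely in a phase factor of modulus 1, so it cancels in the absolute square even though the wave function itself keeps oscillating.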
The Schrödinger equation acts on the entire probability amplitude, not merely its absolute value. Whereas the
absolute value of the probability amplitude encodes information about probabilities, its phase encodes information
about the interference between quantum states. This gives rise to the wave-like behavior of quantum states. It turns
out that analytic solutions of Schrödinger's equation are only available for a small number of model Hamiltonians, of
which the quantum harmonic oscillator, the particle in a box, the hydrogen molecular ion and the hydrogen atom are
the most important representatives. Even the helium atom, which contains just one more electron than hydrogen,
defies all attempts at a fully analytic treatment. There exist several techniques for generating approximate solutions.
For instance, in the method known as perturbation theory one uses the analytic results for a simple quantum
mechanical model to generate results for a more complicated model related to the simple model by, for example, the
addition of a weak potential energy. Another method is the "semi-classical equation of motion" approach, which
applies to systems for which quantum mechanics produces weak deviations from classical behavior. The deviations
can be calculated based on the classical motion. This approach is important for the field of quantum chaos.
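Perturbation theory, as described above, can be tried on a toy model where the exact answer is also available. The sketch below assumes a two-level system with unperturbed energies E1 < E2 and a small real symmetric perturbation λV; the model, the numbers and the function names are illustrative assumptions rather than anything taken from the original text.

```python
# Illustrative sketch: non-degenerate perturbation theory for a two-level
# system H = diag(e1, e2) + lam * V, compared with exact diagonalization.
# The model, the numbers and the names are hypothetical choices.
import math


def exact_levels(e1, e2, v, lam):
    """Exact eigenvalues (ascending) of the 2x2 real symmetric Hamiltonian."""
    h11 = e1 + lam * v[0][0]
    h22 = e2 + lam * v[1][1]
    h12 = lam * v[0][1]
    mean = (h11 + h22) / 2
    radius = math.hypot((h11 - h22) / 2, h12)
    return (mean - radius, mean + radius)


def perturbative_levels(e1, e2, v, lam, order=1):
    """Energies to first or second order in lam (requires e1 != e2)."""
    # First order: each level shifts by the diagonal element of the perturbation.
    levels = [e1 + lam * v[0][0], e2 + lam * v[1][1]]
    if order >= 2:
        # Second order: the two levels repel each other.
        shift = lam ** 2 * v[0][1] ** 2 / (e1 - e2)
        levels[0] += shift
        levels[1] -= shift
    return tuple(levels)


if __name__ == "__main__":
    V = [[0.5, 0.2], [0.2, -0.3]]
    print(exact_levels(0.0, 1.0, V, 0.1))
    print(perturbative_levels(0.0, 1.0, V, 0.1, order=1))
    print(perturbative_levels(0.0, 1.0, V, 0.1, order=2))
```

For a weak perturbation the first-order estimate is already close to the exact levels, and the second-order correction reduces the remaining error further.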
There are numerous mathematically equivalent formulations of quantum mechanics. One of the oldest and most
commonly used formulations is the transformation theory proposed by Cambridge theoretical physicist Paul Dirac,
which unifies and generalizes the two earliest formulations of quantum mechanics, matrix mechanics (invented by
Werner Heisenberg) and wave mechanics (invented by Erwin Schrödinger).[26] In this formulation, the
instantaneous state of a quantum system encodes the probabilities of its measurable properties, or "observables".
Examples of observables include energy, position, momentum, and angular momentum. Observables can be either
continuous (e.g., the position of a particle) or discrete (e.g., the energy of an electron bound to a hydrogen atom).[27]
An alternative formulation of quantum mechanics is Feynman's path integral formulation, in which a
quantum-mechanical amplitude is considered as a sum over histories between initial and final states; this is the
quantum-mechanical counterpart of action principles in classical mechanics.
Interactions with other scientific theories
The fundamental rules of quantum mechanics are very general. They assert that the state space of a system is a Hilbert
space and the observables are Hermitian operators acting on that space, but do not tell us which Hilbert space or
which operators, or whether they even exist. These must be chosen appropriately in order to obtain a quantitative description
of a quantum system. An important guide for making these choices is the correspondence principle, which states that
the predictions of quantum mechanics reduce to those of classical physics when a system moves to higher energies
or equivalently, larger quantum numbers. In other words, classical mechanics is simply a quantum mechanics of
large systems. This "high energy" limit is known as the classical or correspondence limit. One can therefore start
from an established classical model of a particular system, and attempt to guess the underlying quantum model that
gives rise to the classical model in the correspondence limit.
When quantum mechanics was originally formulated, it was applied to models whose correspondence limit was
non-relativistic classical mechanics. For instance, the well-known model of the quantum harmonic oscillator uses an
explicitly non-relativistic expression for the kinetic energy of the oscillator, and is thus a quantum version of the
classical harmonic oscillator.
Early attempts to merge quantum mechanics with special relativity involved the replacement of the Schrödinger
equation with a covariant equation such as the Klein-Gordon equation or the Dirac equation. While these theories
were successful in explaining many experimental results, they had certain unsatisfactory qualities stemming from
their neglect of the relativistic creation and annihilation of particles. A fully relativistic quantum theory required the
development of quantum field theory, which applies quantization to a field rather than a fixed set of particles. The
first complete quantum field theory, quantum electrodynamics, provides a fully quantum description of the
electromagnetic interaction. The full apparatus of quantum field theory is often unnecessary for describing
electrodynamic systems. A simpler approach, one employed since the inception of quantum mechanics, is to treat
charged particles as quantum mechanical objects being acted on by a classical electromagnetic field. For example,
the elementary quantum model of the hydrogen atom describes the electric field of the hydrogen atom using a
classical $-e^2/(4\pi\epsilon_0 r)$ Coulomb potential. This "semi-classical" approach fails if quantum fluctuations in the
electromagnetic field play an important role, such as in the emission of photons by charged particles. Quantum field
theories for the strong nuclear force and the weak nuclear force have been developed. The quantum field theory of
the strong nuclear force is called quantum chromodynamics, and describes the interactions of the subnuclear
particles: quarks and gluons. The weak nuclear force and the electromagnetic force were unified, in their quantized
forms, into a single quantum field theory known as electroweak theory, by the physicists Abdus Salam, Sheldon
Glashow and Steven Weinberg. These three men shared the Nobel Prize in Physics in 1979 for this work.[28]
It has proven difficult to construct quantum models of gravity, the remaining fundamental force. Semi-classical
approximations are workable, and have led to predictions such as Hawking radiation. However, the formulation of a
complete theory of quantum gravity is hindered by apparent incompatibilities between general relativity, the most
accurate theory of gravity currently known, and some of the fundamental assumptions of quantum theory. The
resolution of these incompatibilities is an area of active research, and theories such as string theory are among the
possible candidates for a future theory of quantum gravity. Classical mechanics has been extended into the complex
domain and complex classical mechanics exhibits behaviours similar to quantum mechanics.
Quantum mechanics and classical physics
Predictions of quantum mechanics have been verified experimentally to a very high degree of accuracy. According
to the correspondence principle between classical and quantum mechanics, all objects obey the laws of quantum
mechanics, and classical mechanics is just an approximation for large systems (or a statistical quantum mechanics of
a large collection of particles). The laws of classical mechanics thus follow from the laws of quantum mechanics as a
statistical average at the limit of large systems or large quantum numbers. However, chaotic systems do not have
good quantum numbers, and quantum chaos studies the relationship between classical and quantum descriptions in
these systems.
Quantum coherence is an essential difference between classical and quantum theories, and is illustrated by the
Einstein-Podolsky-Rosen paradox. Quantum interference involves the addition of probability amplitudes, whereas
when classical waves interfere there is an addition of intensities. For microscopic bodies, the extension of the system
is much smaller than the coherence length, which gives rise to long-range entanglement and other nonlocal
phenomena characteristic of quantum systems. Quantum coherence is not typically evident at macroscopic scales,
although an exception to this rule can occur at extremely low temperatures, when quantum behavior can manifest
itself on more macroscopic scales (see Bose-Einstein condensate). This is in accordance with the following
observations:
• Many macroscopic properties of a classical system are direct consequences of the quantum behavior of its parts.
For example, the stability of bulk matter (which consists of atoms and molecules which would quickly collapse
under electric forces alone), the rigidity of solids, and the mechanical, thermal, chemical, optical and magnetic
properties of matter are all results of the interaction of electric charges under the rules of quantum mechanics.
• While the seemingly exotic behavior of matter posited by quantum mechanics and relativity theory becomes more
apparent when dealing with extremely fast-moving or extremely tiny particles, the laws of classical Newtonian
physics remain accurate in predicting the behavior of large objects (of the order of the size of large molecules
and bigger) at velocities much smaller than the velocity of light.
Relativity and quantum mechanics
Main articles: Quantum gravity and Theory of everything
Although the defining postulates of both Einstein's theory of general relativity and quantum theory are
supported by rigorous and repeated empirical evidence, and although they do not directly contradict each
other theoretically (at least with regard to primary claims), the two theories have resisted incorporation within one
cohesive model.
Einstein himself is well known for rejecting some of the claims of quantum mechanics. While clearly contributing to
the field, he did not accept the more philosophical consequences and interpretations of quantum mechanics, such as
the lack of deterministic causality and the assertion that a single subatomic particle can occupy numerous areas of
space at one time. He also was the first to notice some of the apparently exotic consequences of entanglement and
used them to formulate the Einstein-Podolsky-Rosen paradox, in the hope of showing that quantum mechanics had
unacceptable implications. This was in 1935; in 1964 John Bell showed (see Bell inequality) that, although
Einstein was correct in identifying seemingly paradoxical implications of quantum mechanical nonlocality, these
implications could be experimentally tested. Alain Aspect's initial experiments in 1982, and many subsequent
experiments since, have verified quantum entanglement.
According to Bell's paper and the Copenhagen interpretation (the common interpretation of quantum
mechanics among physicists since 1927), and contrary to Einstein's ideas, quantum mechanics cannot be at the same time both
• a "realistic" theory, and
• a local theory.
The Einstein-Podolsky-Rosen paradox shows in any case that there exist experiments by which one can measure the
state of one particle and instantaneously change the state of its entangled partner, although the two particles can be
an arbitrary distance apart; however, this effect does not violate causality, since no transfer of information happens.
Quantum entanglement is at the basis of quantum cryptography, with high-security commercial applications in
banking and government.
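The experimentally testable difference between quantum mechanics and local realistic theories can be sketched with the CHSH form of Bell's inequality. The example below is a minimal illustration, not taken from the text: it assumes a spin-singlet pair, for which quantum mechanics predicts the correlation E(a, b) = -cos(a - b) between measurements along angles a and b, and it uses one standard choice of four angles. The function names are hypothetical. Enumerating every assignment of fixed outcomes +1 or -1 gives the local bound |S| <= 2 (mixtures of such assignments cannot exceed it), while the quantum prediction reaches 2*sqrt(2).

```python
import itertools
import math


def singlet_correlation(a, b):
    """Quantum prediction E(a, b) = -cos(a - b) for a spin-singlet pair
    measured along directions at angles a and b (radians)."""
    return -math.cos(a - b)


def chsh(E, a, a2, b, b2):
    """CHSH combination S = E(a,b) - E(a,b') + E(a',b) + E(a',b')."""
    return E(a, b) - E(a, b2) + E(a2, b) + E(a2, b2)


def local_deterministic_bound():
    """Largest |S| when every outcome is a predetermined value +1 or -1."""
    best = 0
    for A, A2, B, B2 in itertools.product((-1, 1), repeat=4):
        s = A * B - A * B2 + A2 * B + A2 * B2
        best = max(best, abs(s))
    return best


# Illustrative angle choice (an assumption): a = 0, a' = pi/2, b = pi/4, b' = 3*pi/4.
quantum_s = chsh(singlet_correlation, 0.0, math.pi / 2, math.pi / 4, 3 * math.pi / 4)
classical_bound = local_deterministic_bound()

if __name__ == "__main__":
    print(f"local realistic bound: |S| <= {classical_bound}")
    print(f"quantum prediction:    |S|  = {abs(quantum_s):.4f}")
```

The gap between the two numbers is what Bell-type experiments probe; the sketch says nothing about experimental loopholes or detector efficiencies.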
Gravity is negligible in many areas of particle physics, so that unification between general relativity and quantum
mechanics is not an urgent issue in those applications. However, the lack of a correct theory of quantum gravity is an
important issue in cosmology and physicists' search for an elegant "theory of everything". Thus, resolving the
inconsistencies between both theories has been a major goal of twentieth- and twenty-first-century physics. Many
prominent physicists, including Stephen Hawking, have labored in the attempt to discover a theory underlying
everything, combining not only different models of subatomic physics, but also deriving the universe's four
forces — the strong force, electromagnetism, weak force, and gravity — from a single force or phenomenon. One of
the leaders in this field is Edward Witten, a theoretical physicist who formulated the groundbreaking M-theory,
an attempt to describe supersymmetry-based string theory.
Attempts at a unified field theory
As of 2010 the quest for unifying the fundamental forces through quantum mechanics is still ongoing. Quantum
electrodynamics (or "quantum electromagnetism"), which is currently (in the perturbative regime at least) the most
accurately tested physical theory, has been successfully merged with the weak nuclear force into the electroweak
force, and work is currently being done to merge the electroweak and strong forces into the electrostrong force.
Current predictions state that at around $10^{14}$ GeV the three aforementioned forces are fused into a single unified
field.[36] Beyond this "grand unification," it is speculated that it may be possible to merge gravity with the other
three gauge symmetries, expected to occur at roughly $10^{19}$ GeV. However, while special relativity is
parsimoniously incorporated into quantum electrodynamics, the expanded general relativity, currently the best
theory describing the gravitation force, has not been fully incorporated into quantum theory.
Philosophical implications
Since its inception, the many counter-intuitive results of quantum mechanics have provoked strong philosophical
debate and many interpretations. Even fundamental issues such as Max Born's basic rules concerning probability
amplitudes and probability distributions took decades to be appreciated.
Richard Feynman said, "I think I can safely say that nobody understands quantum mechanics."[37]
The Copenhagen interpretation, due largely to the Danish theoretical physicist Niels Bohr, is the interpretation of the
quantum mechanical formalism most widely accepted amongst physicists. According to it, the probabilistic nature of
quantum mechanics is not a temporary feature which will eventually be replaced by a deterministic theory, but
instead must be considered to be a final renunciation of the classical ideal of causality. In this interpretation, it is
believed that any well-defined application of the quantum mechanical formalism must always make reference to the
experimental arrangement, due to the complementary nature of evidence obtained under different experimental
situations.
Albert Einstein, himself one of the founders of quantum theory, disliked this loss of determinism in measurement.
(This dislike is the source of his famous quote, "God does not play dice with the universe.") Einstein held that there
should be a local hidden variable theory underlying quantum mechanics and that, consequently, the present theory
was incomplete. He produced a series of objections to the theory, the most famous of which has become known as
the Einstein-Podolsky-Rosen paradox. John Bell showed that the EPR paradox led to experimentally testable
differences between quantum mechanics and local realistic theories. Experiments have been performed confirming
the accuracy of quantum mechanics, thus demonstrating that the physical world cannot be described by local realistic
theories.[38] The Bohr-Einstein debates provide a vibrant critique of the Copenhagen Interpretation from an
epistemological point of view.
The Everett many-worlds interpretation, formulated in 1956, holds that all the possibilities described by quantum
theory simultaneously occur in a multiverse composed of mostly independent parallel universes.[39] This is not
accomplished by introducing some new axiom to quantum mechanics, but on the contrary by removing the axiom of
the collapse of the wave packet: All the possible consistent states of the measured system and the measuring
apparatus (including the observer) are present in a real physical (not just formally mathematical, as in other
interpretations) quantum superposition. Such a superposition of consistent state combinations of different systems is
called an entangled state. While the multiverse is deterministic, we perceive non-deterministic behavior governed by
probabilities, because we can observe only the universe we inhabit, i.e. our consistent state contribution to the
superposition. Everett's interpretation is perfectly consistent with the Bell test experiments and makes
them intuitively understandable. However, according to the theory of quantum decoherence, the parallel universes
will never be accessible to us. This inaccessibility can be understood as follows: Once a measurement is done, the
measured system becomes entangled with both the physicist who measured it and a huge number of other particles,
some of which are photons flying away towards the other end of the universe; in order to prove that the wave
function did not collapse one would have to bring all these particles back and measure them again, together with the
system that was measured originally. This is completely impractical, but even if one could theoretically do this, it
would destroy any evidence that the original measurement took place (including the physicist's memory).
Applications
Quantum mechanics has had enormous success in explaining many of the features of our world. The individual
behaviour of the subatomic particles that make up all forms of matter (electrons, protons, neutrons, photons and
others) can often only be satisfactorily described using quantum mechanics. Quantum mechanics has strongly
influenced string theory, a candidate for a theory of everything (see reductionism) and the multiverse hypothesis.
Quantum mechanics is important for understanding how individual atoms combine covalently to form chemicals or
molecules. The application of quantum mechanics to chemistry is known as quantum chemistry. (Relativistic)
quantum mechanics can in principle mathematically describe most of chemistry. Quantum mechanics can provide
quantitative insight into ionic and covalent bonding processes by explicitly showing which molecules are
energetically favorable to which others, and by approximately how much.[40] Most of the calculations performed in
computational chemistry rely on quantum mechanics.[41]
Much of modern technology operates
at a scale where quantum effects are
significant. Examples include the laser,
the transistor (and thus the microchip),
the electron microscope, and magnetic
resonance imaging. The study of
semiconductors led to the invention of
the diode and the transistor, which are
indispensable for modern electronics.
Researchers are currently seeking
robust methods of directly
manipulating quantum states. Efforts
are being made to develop quantum
cryptography, which will allow
guaranteed secure transmission of
information. A more distant goal is the
development of quantum computers,
which are expected to perform certain
computational tasks exponentially
faster than classical computers. Another active research topic is quantum teleportation, which deals with techniques
to transmit quantum information over arbitrary distances.
Quantum tunneling is vital in many devices, even in the simple light switch, as otherwise the electrons in the electric
current could not penetrate the potential barrier made up of a layer of oxide. Flash memory chips found in USB
drives use quantum tunneling to erase their memory cells.
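To give a feeling for how strongly tunneling depends on barrier thickness, the sketch below evaluates the textbook transmission probability for a particle of energy E meeting an idealized rectangular barrier of height V0 > E and width a, T = 1 / (1 + V0^2 sinh^2(kappa a) / (4 E (V0 - E))) with kappa = sqrt(2 m (V0 - E)) / hbar. The rectangular shape and the numbers used (a 1 eV electron, a 2 eV barrier, widths around a nanometre) are illustrative assumptions, not a model of any real oxide layer or memory cell; the function name is hypothetical.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J*s
M_E = 9.1093837015e-31   # electron mass, kg
EV = 1.602176634e-19     # joules per electronvolt


def transmission(energy_ev, barrier_ev, width_m, mass=M_E):
    """Transmission probability through a rectangular barrier for 0 < E < V0.

    T = 1 / (1 + V0^2 sinh^2(kappa a) / (4 E (V0 - E))),
    kappa = sqrt(2 m (V0 - E)) / hbar.
    """
    if not 0.0 < energy_ev < barrier_ev:
        raise ValueError("this sketch only covers 0 < E < V0")
    e = energy_ev * EV
    v0 = barrier_ev * EV
    kappa = math.sqrt(2.0 * mass * (v0 - e)) / HBAR   # decay constant inside the barrier
    s = math.sinh(kappa * width_m)
    return 1.0 / (1.0 + (v0 * v0 * s * s) / (4.0 * e * (v0 - e)))


if __name__ == "__main__":
    # Hypothetical barrier widths in metres.
    for width in (0.5e-9, 1.0e-9, 2.0e-9):
        print(f"a = {width * 1e9:.1f} nm  ->  T = {transmission(1.0, 2.0, width):.3e}")
```

Classically the transmission would be exactly zero for E < V0; here it is small but nonzero and falls off roughly exponentially as the barrier gets thicker.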
Quantum mechanics primarily applies to the atomic regimes of matter and energy, but some systems exhibit
quantum mechanical effects on a large scale; superfluidity (the frictionless flow of a liquid at temperatures near
absolute zero) is one well-known example. Quantum theory also provides accurate descriptions for many previously
unexplained phenomena such as black body radiation and the stability of electron orbitals. It has also given insight
into the workings of many different biological systems, including smell receptors and protein structures.[42] Recent
work on photosynthesis has provided evidence that quantum correlations play an essential role in this most
fundamental process of the plant kingdom.[43] Even so, classical physics often can be a good approximation to
results otherwise obtained by quantum physics, typically in circumstances with large numbers of particles or large
quantum numbers. (However, some open questions remain in the field of quantum chaos.)
[Figure: A working mechanism of a resonant tunneling diode device, based on the phenomenon of quantum tunneling
through the potential barriers. Composite plot of bands, transmission, current density and I-V at bias = 0 V, created
with RTD-NEGF on nanoHUB.]
Examples
Particle in a box
[Figure: 1-dimensional potential energy box (or infinite potential well), with walls at x = 0 and x = L.]
The particle in a 1-dimensional potential energy box is the simplest example where restraints lead to the
quantization of energy levels. The box is defined as having zero potential energy inside a certain region and
infinite potential energy everywhere outside that region. For the 1-dimensional case in the x direction, the
time-independent Schrodinger equation can be written as:[44]
$$-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} = E\psi.$$
Writing the differential operator
$$\hat{p}_x = -i\hbar\frac{d}{dx}$$
the previous equation can be seen to be evocative of the classic analogue
$$\frac{1}{2m}\hat{p}_x^2 = E$$
with E as the energy for the state $\psi$, in this case coinciding with the kinetic energy of the particle.
The general solutions of the Schrodinger equation for the particle in a box are:
$$\psi(x) = Ae^{ikx} + Be^{-ikx}, \qquad E = \frac{\hbar^2 k^2}{2m}$$
or, from Euler's formula,
$$\psi(x) = C\sin kx + D\cos kx.$$
The presence of the walls of the box determines the values of C, D, and k. At each wall (x = 0 and x = L), $\psi = 0$.
Thus when x = 0,
$$\psi(0) = 0 = C\sin 0 + D\cos 0 = D$$
and so D = 0. When x = L,
$$\psi(L) = 0 = C\sin kL.$$
C cannot be zero, since this would conflict with the Born interpretation. Therefore sin kL = 0, and so it must be that
kL is an integer multiple of $\pi$. Therefore,
$$k = \frac{n\pi}{L}, \qquad n = 1, 2, 3, \ldots$$
The quantization of energy levels follows from this constraint on k, since
$$E = \frac{\hbar^2\pi^2 n^2}{2mL^2} = \frac{n^2 h^2}{8mL^2}.$$
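The quantization result can be checked numerically. The sketch below is illustrative only: it assumes an electron in a hypothetical box of width L = 1 nm, evaluates the levels from the allowed wavenumbers k = n pi / L, and uses a central finite difference to confirm that psi(x) = sin(n pi x / L) satisfies the time-independent equation. The function names and the chosen box width are not from the text.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J*s
M_E = 9.1093837015e-31   # electron mass, kg
EV = 1.602176634e-19     # joules per electronvolt


def box_energy(n, mass, length):
    """E_n = hbar^2 k^2 / (2 m) with the allowed wavenumber k = n pi / L."""
    k = n * math.pi / length
    return (HBAR * k) ** 2 / (2.0 * mass)


def relative_residual(n, mass, length, x, step_fraction=1e-4):
    """Check -hbar^2/(2m) psi'' = E psi at x for psi(x) = sin(n pi x / L).

    psi'' is approximated by a central finite difference; the return value
    is |lhs - rhs| / |rhs|, which should be small away from nodes of psi.
    """
    k = n * math.pi / length
    dx = step_fraction * length

    def psi(pos):
        return math.sin(k * pos)

    second_derivative = (psi(x + dx) - 2.0 * psi(x) + psi(x - dx)) / dx ** 2
    lhs = -HBAR ** 2 / (2.0 * mass) * second_derivative
    rhs = box_energy(n, mass, length) * psi(x)
    return abs(lhs - rhs) / abs(rhs)


LENGTH = 1.0e-9  # hypothetical box width: 1 nm
levels_ev = [box_energy(n, M_E, LENGTH) / EV for n in (1, 2, 3)]

if __name__ == "__main__":
    for n, energy in zip((1, 2, 3), levels_ev):
        print(f"n = {n}: E = {energy:.3f} eV")
```

With these assumptions the ground level comes out at roughly 0.376 eV, and the levels scale as n squared, so the second and third levels are 4 and 9 times the first.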
Free particle
Consider a free particle. In quantum mechanics, wave-particle duality means that the properties of the particle can
be described as the properties of a wave. Its quantum state can therefore be represented as a wave of arbitrary shape
extending over space, the wave function. The position and momentum of the particle are observables. The uncertainty
principle states that the position and the momentum cannot both be measured with full precision at the same time.
However, one can measure the position alone of a moving free particle, creating an eigenstate of position with a
wavefunction that is very large (a Dirac delta) at a particular position x and zero everywhere else. If one performs
a position measurement on such a wavefunction, the result x will be obtained with 100% probability (full certainty).
This is called an eigenstate of position (mathematically more precise: a generalized position eigenstate
(eigendistribution)). If the particle is in an eigenstate of position then its momentum is completely unknown. On the
other hand, if the particle is in an eigenstate of momentum then its position is completely unknown.[45] In an
eigenstate of momentum having a plane wave form, it can be shown that the wavelength is equal to h/p, where h is
Planck's constant and p is the momentum of the eigenstate.[46]
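The relation between wavelength and momentum can be worked out in a few lines. The derivation below is a sketch in one dimension, using the momentum operator from the particle-in-a-box example; the symbols A (an arbitrary amplitude) and k (the wavenumber of the plane wave) are introduced here for illustration.

```latex
\begin{align}
\psi_p(x) &= A\,e^{ikx}, \\
\hat{p}_x\,\psi_p(x) &= -i\hbar\,\frac{d}{dx}\bigl(A\,e^{ikx}\bigr) = \hbar k\,\psi_p(x)
  \quad\Longrightarrow\quad p = \hbar k, \\
\lambda &= \frac{2\pi}{k} = \frac{2\pi\hbar}{p} = \frac{h}{p}.
\end{align}
```

The plane wave is an eigenfunction of the momentum operator with eigenvalue p = hbar k, and since |psi_p(x)|^2 = |A|^2 is the same everywhere, the position is indeed completely undetermined.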
Notes
[1] J. Mehra and H. Rechenberg, The historical development of quantum theory, Springer-Verlag, 1982.
[2] T.S. Kuhn, Black-body theory and the quantum discontinuity 1894-1912, Clarendon Press, Oxford, 1978.
[3] A. Einstein, Uber einen die Erzeugung und Verwandlung des Lichtes betreffenden heuristischen Gesichtspunkt (On a heuristic point of view
concerning the production and transformation of light), Annalen der Physik 17 (1905) 132-148 (reprinted in The collected papers of Albert
Einstein, John Stachel, editor, Princeton University Press, 1989, Vol. 2, pp. 149-166, in German; see also Einstein's early work on the
quantum hypothesis, ibid. pp. 134-148).
[4] "Merriam-Webster.com" (http://www.merriam-webster.com/dictionary/quantum). Merriam-Webster.com. 2010-08-13. . Retrieved
2010-10-15.
[5] Edwin Thall. "FCCJ.org" (http://mooni.fccj.org/~ethall/quantum/quant.htm). Mooni.fccj.org. . Retrieved 2010-10-15.
[6] Compare the list of conferences presented here (http://ysfine.com/).
[7] Oocities.com (http://web.archive.org/web/20091026095410/http://geocities.com/mik_malm/quantmech.html)
[8] P.A.M. Dirac, The Principles of Quantum Mechanics, Clarendon Press, Oxford, 1930.
[9] J. von Neumann, Mathematische Grundlagen der Quantenmechanik, Springer, Berlin, 1932 (English translation: Mathematical Foundations
of Quantum Mechanics, Princeton University Press, 1955).
[10] Greiner, Walter; Müller, Berndt (1994). Quantum Mechanics Symmetries, Second edition (http://books.google.com/books?id=gCfvWx6vuzUC&pg=PA52). Springer-Verlag. p. 52. ISBN 3-540-58080-8.
[11] "AIP.org" (http://www.aip.org/history/heisenberg/p08a.htm). AIP.org. . Retrieved 2010-10-15.
[12] Greenstein, George; Zajonc, Arthur (2006). The Quantum Challenge: Modern Research on the Foundations of Quantum Mechanics, Second
edition (http://books.google.com/books?id=5t0tm0FBlCsC&pg=PA215). Jones and Bartlett Publishers, Inc. p. 215. ISBN 0-7637-2470-X.
[13] probability clouds are approximate, but better than the Bohr model, whereby electron location is given by a probability function, the wave
function eigenvalue, such that the probability is the squared modulus of the complex amplitude
[14] "Actapress.com" (http://www.actapress.com/PaperInfo.aspx?PaperID=25988&reason=500). Actapress.com. . Retrieved 2010-10-15.
[15] Hirshleifer, Jack (2001). The Dark Side of the Force: Economic Foundations of Conflict Theory (http://books.google.com/books?id=W2J2IXgiZVgC&pg=PA265). Cambridge University Press. p. 265. ISBN 0-521-80412-4.
[Figure: 3D confined electron wave functions for each eigenstate in a quantum dot. Here, rectangular and
triangular-shaped quantum dots are shown. Energy states in rectangular dots are more 's-type' and 'p-type'. However,
in a triangular dot the wave functions are mixed due to confinement symmetry.]
[16] Dict.cc (http://www.dict.cc/german-english/eigen.html)
De.pons.eu (http://de.pons.eu/deutsch-englisch/eigen)
[17] "PHY.olemiss.edu" (http://www.phy.olemiss.edu/~luca/Topics/qm/collapse.html). PHY.olemiss.edu. 2010-08-16. . Retrieved
2010-10-15.
[18] "Farside.ph.utexas.edu" (http://farside.ph.utexas.edu/teaching/qmech/lectures/node28.html). Farside.ph.utexas.edu. . Retrieved
2010-10-15.
[19] "Reddit.com" (http://www.reddit.com/r/philosophy/comments/8p2qv/determinism_and_naive_realism/). Reddit.com. 2009-06-01. .
Retrieved 2010-10-15.
[20] Michael Trott. "Time-Evolution of a Wavepacket in a Square Well - Wolfram Demonstrations Project" (http://demonstrations.wolfram.com/TimeEvolutionOfAWavepacketInASquareWell/). Demonstrations.wolfram.com. . Retrieved 2010-10-15.
[21] Michael Trott. "Time Evolution of a Wavepacket In a Square Well" (http://demonstrations.wolfram.com/TimeEvolutionOfAWavepacketInASquareWell/). Demonstrations.wolfram.com. . Retrieved 2010-10-15.
[22] Mathews, Piravonu Mathews; Venkatesan, K. (1976). A Textbook of Quantum Mechanics (http://books.google.com/books?id=_qzslDD3TcsC&pg=PA36). Tata McGraw-Hill. p. 36. ISBN 0-07-096510-2.
[23] "Wave Functions and the Schrodinger Equation" (http://physics.ukzn.ac.za/~petruccione/Physl20/Wave Functions and the Schrodinger
Equation.pdf) (PDF). . Retrieved 2010-10-15.
[24] "Spaceandmotion.com" (http://www.spaceandmotion.com/physics-quantum-mechanics-werner-heisenberg.htm). Spaceandmotion.com. .
Retrieved 2010-10-15.
[25] Especially since Werner Heisenberg was awarded the Nobel Prize in Physics in 1932 for the creation of quantum mechanics, the role of Max
Born has been obfuscated. A 2005 biography of Born details his role as the creator of the matrix formulation of quantum mechanics. This was
recognized in a paper by Heisenberg, in 1940, honoring Max Planck. See: Nancy Thorndike Greenspan, "The End of the Certain World: The
Life and Science of Max Born" (Basic Books, 2005), pp. 124 - 128, and 285 - 286.
[26] "IF.uj.edu.pl" (http://th-www.if.uj.edu.pl/acta/vol19/pdf/v19p0683.pdf) (PDF). . Retrieved 2010-10-15.
[27] "OCW.usu.edu" (http://ocw.usu.edu/physics/classical-mechanics/pdf_lectures/06.pdf) (PDF). . Retrieved 2010-10-15.
[28] "The Nobel Prize in Physics 1979" (http://nobelprize.org/nobel_prizes/physics/laureates/1979/index.html). Nobel Foundation. .
Retrieved 2010-02-16.
[29] Complex Elliptic Pendulum (http://arxiv.org/abs/1001.0131), Carl M. Bender, Daniel W. Hook, Karta Kooner
[30] "Scribd.com" (http://www.scribd.com/doc/5998949/Quantum-mechanics-course-iwhatisquantummechanics). Scribd.com. 2008-09-14. .
Retrieved 2010-10-15.
[31] Philsci-archive.pitt.edu (http://philsci-archive.pitt.edu/archive/00002328/01/handbook.pdf)
[32] "Academic.brooklyn.cuny.edu" (http://academic.brooklyn.cuny.edu/physics/sobel/Nucphys/atomprop.html).
Academic.brooklyn.cuny.edu. . Retrieved 2010-10-15.
[33] "Cambridge.org" (http://assets.cambridge.org/97805218/29526/excerpt/9780521829526_excerpt.pdf) (PDF). . Retrieved 2010-10-15.
[34] "There is as yet no logically consistent and complete relativistic quantum field theory.", p. 4. — V. B. Berestetskii, E. M. Lifshitz, L P
Pitaevskii (1971). J. B. Sykes, J. S. Bell (translators). Relativistic Quantum Theory 4, part I. Course of Theoretical Physics (Landau and
Lifshitz) ISBN 0080160255
[35] "Life on the lattice: The most accurate theory we have" (http://latticeqcd.blogspot.com/2005/06/most-accurate-theory-we-have.html).
Latticeqcd.blogspot.com. 2005-06-03. . Retrieved 2010-10-15.
[36] Parker, B. (1993). Overcoming some of the problems, pp. 259-279.
[37] The Character of Physical Law (1965) Ch. 6; also quoted in The New Quantum Universe (2003) by Tony Hey and Patrick Walters
[38] "Plato.stanford.edu" (http://plato.stanford.edu/entries/qm-action-distance/). Plato.stanford.edu. 2007-01-26. . Retrieved 2010-10-15.
[39] "Plato.stanford.edu" (http://plato.stanford.edu/entries/qm-everett/). Plato.stanford.edu. . Retrieved 2010-10-15.
[40] "Books.google.com" (http://books.google.com/books?id=vdXU6SD4_UYC). Books.google.com. . Retrieved 2010-10-23.
[41] "En.wikibooks.org" (http://en.wikibooks.org/wiki/Computational_chemistry/Applications_of_molecular_quantum_mechanics).
En.wikibooks.org. . Retrieved 2010-10-23.
[42] Anderson, Mark (2009-01-13). "Discovermagazine.com" (http://discovermagazine.com/2009/feb/13-is-quantum-mechanics-controlling-your-thoughts/article_view?b_start:int=1&-C). Discovermagazine.com. . Retrieved 2010-10-23.
[43] "Quantum mechanics boosts photosynthesis" (http://physicsworld.com/cws/article/news/41632). physicsworld.com. . Retrieved
2010-10-23.
[44] Derivation of particle in a box, chemistry.tidalswan.com (http://chemistry.tidalswan.com/index.php?title=Quantum_Mechanics)
[45] Davies, P. C. W.; Betts, David S. (1984). Quantum Mechanics, Second edition (http://books.google.com/books?id=XRyHCrGNstoC&pg=PA79). Chapman and Hall. p. 79. ISBN 0-7487-4446-0.
[46] "Books.Google.com" (http://books.google.com/books?id=tKm-Ekwke_UC). Books.Google.com. 2007-08-30. . Retrieved 2010-10-23.
Schrodinger equation
In physics, specifically quantum mechanics, the Schrodinger equation, formulated in 1926 by Austrian physicist
Erwin Schrodinger, is an equation that describes how the quantum state of a physical system changes in time. It is as
central to quantum mechanics as Newton's laws are to classical mechanics.
[Figure: Two forms of the Schrodinger equation.]
In the standard interpretation of quantum mechanics, the quantum state, also called a wavefunction or state vector, is
the most complete description that can be given to a physical system. Solutions to Schrodinger's equation describe
not only molecular, atomic and subatomic systems, but also macroscopic systems, possibly even the whole universe.[1]
The most general form is the time-dependent Schrodinger equation, which gives a description of a system evolving
with time. For systems in a stationary state, the time-independent Schrodinger equation is sufficient. Approximate
solutions to the time-independent Schrodinger equation are commonly used to calculate the energy levels and other
properties of atoms and molecules.
Schrodinger's equation can be mathematically transformed into Werner Heisenberg's matrix mechanics, and into
Richard Feynman's path integral formulation. The Schrodinger equation describes time in a way that is inconvenient
for relativistic theories, a problem which is not as severe in matrix mechanics and completely absent in the path
integral formulation.
The Schrodinger equation
The Schrodinger equation takes several different forms, depending on the physical situation. This section presents
the equation for the general case and for the simple case encountered in many textbooks.
General quantum system
For a general quantum system:
$$i\hbar\frac{\partial}{\partial t}\Psi = \hat{H}\Psi$$
where
• $\Psi$ is the wave function; the probability amplitude for different configurations of the system at different times,
• $i\hbar\frac{\partial}{\partial t}$ is the energy operator ($i$ is the imaginary unit and $\hbar$ is the reduced Planck constant),
• $\hat{H}$ is the Hamiltonian operator.
Single particle in a potential
For a single particle with potential energy V in position space, the Schrodinger equation takes the form:
$$i\hbar\frac{\partial}{\partial t}\Psi(\mathbf{r}, t) = \hat{H}\Psi = \left(-\frac{\hbar^2}{2m}\nabla^2 + V(\mathbf{r})\right)\Psi(\mathbf{r}, t) = -\frac{\hbar^2}{2m}\nabla^2\Psi(\mathbf{r}, t) + V(\mathbf{r})\Psi(\mathbf{r}, t)$$
where
• $-\frac{\hbar^2}{2m}\nabla^2$ is the kinetic energy operator, where m is the mass of the particle.
• $\nabla^2$ is the Laplace operator. In three dimensions, the Laplace operator is
$\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}$, where x, y,
and z are the Cartesian coordinates of space.
• $V(\mathbf{r})$ is the time-independent potential energy at the position r.
• $\Psi(\mathbf{r}, t)$ is the probability amplitude for the particle to be found at position r at time t.
• $\hat{H} = -\frac{\hbar^2}{2m}\nabla^2 + V(\mathbf{r})$ is the Hamiltonian operator for a single particle in a potential.
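As a sanity check on the single-particle equation, the sketch below tests numerically that a one-dimensional plane wave exp(i(kx - omega t)) solves it for V = 0 when omega = hbar k^2 / (2m). It is a minimal illustration under stated assumptions: natural units with hbar = m = 1, one spatial dimension, and derivatives replaced by central finite differences. The function names and the sample point are hypothetical.

```python
import cmath

HBAR = 1.0  # natural units (assumption): hbar = 1
MASS = 1.0  # and m = 1


def plane_wave(x, t, k):
    """Free-particle plane wave exp(i (k x - omega t)), omega = hbar k^2 / (2 m)."""
    omega = HBAR * k * k / (2.0 * MASS)
    return cmath.exp(1j * (k * x - omega * t))


def tdse_residual(psi, x, t, k, potential=0.0, h=1e-4):
    """|i hbar dpsi/dt - (-hbar^2/(2m) d2psi/dx2 + V psi)| at (x, t).

    Both derivatives are approximated by central finite differences with
    step h, so a residual near zero (not exactly zero) indicates a solution.
    """
    dpsi_dt = (psi(x, t + h, k) - psi(x, t - h, k)) / (2.0 * h)
    d2psi_dx2 = (psi(x + h, t, k) - 2.0 * psi(x, t, k) + psi(x - h, t, k)) / h ** 2
    lhs = 1j * HBAR * dpsi_dt
    rhs = -HBAR ** 2 / (2.0 * MASS) * d2psi_dx2 + potential * psi(x, t, k)
    return abs(lhs - rhs)


if __name__ == "__main__":
    # Arbitrary sample point and wavenumber, chosen only for illustration.
    print(tdse_residual(plane_wave, x=0.7, t=0.3, k=2.0))
```

A wave with any other relation between omega and k leaves a residual of order one, which is one way to see that the equation fixes the free-particle dispersion relation.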
Time independent or stationary equation
The time independent equation, again for a single particle with potential energy V, takes the form:[4]
$$E\psi(\mathbf{r}) = -\frac{\hbar^2}{2m}\nabla^2\psi(\mathbf{r}) + V(\mathbf{r})\psi(\mathbf{r}).$$
This equation describes the standing wave solutions of the time-dependent equation, which are the states with
definite energy.
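The link between the two forms can be made explicit by separating variables. The steps below are a sketch that assumes a state of definite energy E, written as a spatial part times a time-dependent phase, and substitutes it into the single-particle time-dependent equation.

```latex
\begin{align}
\Psi(\mathbf{r},t) &= \psi(\mathbf{r})\,e^{-iEt/\hbar}, \\
i\hbar\,\frac{\partial}{\partial t}\Psi(\mathbf{r},t) &= E\,\psi(\mathbf{r})\,e^{-iEt/\hbar}, \\
\left(-\frac{\hbar^2}{2m}\nabla^2 + V(\mathbf{r})\right)\Psi(\mathbf{r},t)
  &= \left(-\frac{\hbar^2}{2m}\nabla^2\psi(\mathbf{r}) + V(\mathbf{r})\psi(\mathbf{r})\right)e^{-iEt/\hbar}, \\
\Longrightarrow\quad E\,\psi(\mathbf{r}) &= -\frac{\hbar^2}{2m}\nabla^2\psi(\mathbf{r}) + V(\mathbf{r})\psi(\mathbf{r}).
\end{align}
```

The common phase factor cancels, and because it has unit modulus the probability density |Psi|^2 = |psi|^2 does not change in time, which is why such states are called stationary.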
Historical background and development
Following Max Planck's quantization of light (see black body radiation), Albert Einstein interpreted Planck's
quantum to be photons, particles of light, and proposed that the energy of a photon is proportional to its frequency,
one of the first signs of wave-particle duality. Since energy and momentum are related in the same way as frequency
and wavenumber in special relativity, it followed that the momentum p of a photon is proportional to its wavenumber k:
$$p = \frac{h}{\lambda} = \hbar k$$
Louis de Broglie hypothesized that this is true for all particles, even particles such as electrons. Assuming that the
waves travel roughly along classical paths, he showed that they form standing waves for certain discrete frequencies.
These correspond to discrete energy levels, which reproduced the old quantum condition.
Following up on these ideas, Schrodinger decided to find a proper wave equation for the electron. He was guided by
William R. Hamilton's analogy between mechanics and optics, encoded in the observation that the zero-wavelength
limit of optics resembles a mechanical system — the trajectories of light rays become sharp tracks which obey
Fermat's principle, an analog of the principle of least action.[6] A modern version of his reasoning is reproduced in
the next section. The equation he found is:
$$i\hbar\frac{\partial}{\partial t}\Psi(\mathbf{x}, t) = -\frac{\hbar^2}{2m}\nabla^2\Psi(\mathbf{x}, t) + V(\mathbf{x})\Psi(\mathbf{x}, t).$$
Using this equation, Schrodinger computed the hydrogen spectral series by treating a hydrogen atom's electron as a
wave $\Psi(\mathbf{x}, t)$, moving in a potential well V, created by the proton. This computation accurately reproduced the energy
levels of the Bohr model.
However, by that time, Arnold Sommerfeld had refined the Bohr model with relativistic corrections.[7][8]
Schrodinger used the relativistic energy momentum relation to find what is now known as the Klein-Gordon
equation in a Coulomb potential (in natural units):
$$\left(E + \frac{e^2}{r}\right)^2\psi(x) = -\nabla^2\psi(x) + m^2\psi(x).$$
He found the standing waves of this relativistic equation, but the relativistic corrections disagreed with Sommerfeld's
formula. Discouraged, he put away his calculations and secluded himself in an isolated mountain cabin with a
lover.[9]
While at the cabin, Schrodinger decided that his earlier non-relativistic calculations were novel enough to publish,
and decided to leave off the problem of relativistic corrections for the future. He put together his wave equation and
the spectral analysis of hydrogen in a paper in 1926.[10] The paper was enthusiastically endorsed by Einstein, who
saw the matter-waves as an intuitive depiction of nature, as opposed to Heisenberg's matrix mechanics, which he
considered overly formal.[11]
The Schrodinger equation details the behaviour of $\psi$ but says nothing of its nature. Schrodinger tried to interpret it as
a charge density in his fourth paper, but he was unsuccessful.[12] In 1926, just a few days after Schrodinger's fourth
and final paper was published, Max Born successfully interpreted $\psi$ as a probability amplitude.[13] Schrodinger,
though, always opposed a statistical or probabilistic approach, with its associated discontinuities (much like
Einstein, who believed that quantum mechanics was a statistical approximation to an underlying deterministic
theory), and never reconciled with the Copenhagen interpretation.[14]
Derivation
Short heuristic derivation
Schrodinger's equation can be derived in the following short heuristic way.
Assumptions
1. The total energy E of a particle is
$$E = T + V = \frac{p^2}{2m} + V.$$
This is the classical expression for a particle with mass m, where the total energy E is the sum of the kinetic
energy T and the potential energy V (which can vary with position and time); p and m are respectively the
momentum and the mass of the particle.
2. Einstein's light quanta hypothesis of 1905, which asserts that the energy E of a photon is proportional to the
frequency $\nu$ (or angular frequency, $\omega = 2\pi\nu$) of the corresponding electromagnetic wave:
$$E = h\nu = \hbar\omega.$$
3. The de Broglie hypothesis of 1924, which states that any particle can be associated with a wave, and that the
momentum p of the particle is related to the wavelength $\lambda$ (or wavenumber k) of such a wave by:
$$p = \frac{h}{\lambda} = \hbar k.$$
Expressing p and k as vectors, we have
$$\mathbf{p} = \hbar\mathbf{k}.$$
4. The three assumptions above allow one to derive the equation for plane waves only. To conclude that it is true in
general requires the superposition principle, and thus, one must separately postulate that the Schrodinger equation
is linear.
Expressing the wave function as a complex plane wave
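Schrodinger's idea was to express the phase of a plane wave as a complex phase factor:
$$\Psi(\mathbf{x}, t) = A e^{i(\mathbf{k}\cdot\mathbf{x} - \omega t)}$$
and to realize that since
$$\frac{\partial}{\partial t}\Psi = -i\omega\Psi$$
then
$$E\Psi = \hbar\omega\Psi = i\hbar\frac{\partial}{\partial t}\Psi$$
and similarly since
$$\frac{\partial}{\partial x}\Psi = ik_x\Psi$$
and
$$\frac{\partial^2}{\partial x^2}\Psi = -k_x^2\Psi$$
we find:
$$p_x^2\Psi = (\hbar k_x)^2\Psi = -\hbar^2\frac{\partial^2}{\partial x^2}\Psi$$
so that, again for a plane wave, he obtained:
$$p^2\Psi = (p_x^2 + p_y^2 + p_z^2)\Psi = -\hbar^2\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right)\Psi = -\hbar^2\nabla^2\Psi.$$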
And, by inserting these expressions for the energy and momentum into the classical formula we started with, we get
Schrodinger's famed equation, for a single particle in the 3-dimensional case in the presence of a potential V:
$$i\hbar\frac{\partial}{\partial t}\Psi = -\frac{\hbar^2}{2m}\nabla^2\Psi + V\Psi.$$
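The plane-wave step of this derivation can be checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes one space dimension, V = 0, natural units (the constants HBAR and MASS are illustrative choices), and it replaces the derivatives with central finite differences. The function names are hypothetical.

```python
import cmath

HBAR = 1.0  # reduced Planck constant in natural units (assumption)
MASS = 1.0  # particle mass in the same units (assumption)


def plane_wave(x, t, k):
    """Psi(x, t) = exp(i (k x - omega t)) with omega = hbar k^2 / (2 m)."""
    omega = HBAR * k ** 2 / (2.0 * MASS)
    return cmath.exp(1j * (k * x - omega * t))


def free_schrodinger_residual(x, t, k, h=1e-3):
    """Return i*hbar*dPsi/dt + (hbar^2 / 2m) * d2Psi/dx2 by central differences.

    For an exact solution of the free (V = 0) equation this vanishes,
    up to the discretisation error of the finite differences.
    """
    dpsi_dt = (plane_wave(x, t + h, k) - plane_wave(x, t - h, k)) / (2.0 * h)
    d2psi_dx2 = (plane_wave(x + h, t, k) - 2.0 * plane_wave(x, t, k)
                 + plane_wave(x - h, t, k)) / h ** 2
    return 1j * HBAR * dpsi_dt + HBAR ** 2 / (2.0 * MASS) * d2psi_dx2


residual = free_schrodinger_residual(x=0.3, t=0.7, k=2.0)
print(abs(residual))  # tiny compared with the individual terms, which are of order k^2 / 2
```

The residual is small only because the frequency is tied to the wavenumber by E = p²/2m; with any other choice of omega the two terms would no longer cancel.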
Versions
There are several equations that go by Schrodinger's name:
Time dependent equation
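This is the equation of motion for the quantum state. In the most general form, it is written:[15]
$$i\hbar\frac{\partial}{\partial t}\Psi(x, t) = \hat{H}\Psi(x, t)$$
where $\hat{H}$ is a linear operator acting on the wavefunction $\Psi$. For the specific case of a single particle in one
dimension moving under the influence of a potential V:[15]
$$i\hbar\frac{\partial}{\partial t}\Psi(x, t) = -\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2}\Psi(x, t) + V(x)\Psi(x, t)$$
and the operator $\hat{H}$ can be read off:
$$\hat{H} = -\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2} + V(x).$$
For a particle in three dimensions, the only difference is more derivatives:
$$i\hbar\frac{\partial}{\partial t}\Psi(x, y, z, t) = -\frac{\hbar^2}{2m}\nabla^2\Psi(x, y, z, t) + V(x, y, z)\Psi(x, y, z, t)$$
and for N particles, the difference is that the wavefunction is in 3N-dimensional configuration space, the space of all
possible particle positions:[16]
$$i\hbar\frac{\partial}{\partial t}\Psi(\mathbf{x}_1, \ldots, \mathbf{x}_N, t) = -\frac{\hbar^2}{2}\left(\frac{\nabla_1^2}{m_1} + \cdots + \frac{\nabla_N^2}{m_N}\right)\Psi(\mathbf{x}_1, \ldots, \mathbf{x}_N, t) + V(\mathbf{x}_1, \ldots, \mathbf{x}_N)\Psi(\mathbf{x}_1, \ldots, \mathbf{x}_N, t).$$
This last equation is in a very high dimension, so that the solutions are not easy to visualize.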
Time independent equation
This is the equation for the standing waves, the eigenvalue equation for $\hat{H}$. In abstract form, for a general quantum
system, it is written:[15]
$$\hat{H}\psi = E\psi.$$
For a particle in one dimension,
$$E\psi = -\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + V(x)\psi.$$
But there is a further restriction: the solution must not grow at infinity, so that it has either a finite $L^2$-norm (if it is
a bound state) or a slowly diverging norm (if it is part of a continuum):[17]
$$\|\psi\|^2 = \int |\psi(x)|^2\,dx.$$
For example, when there is no potential, the equation reads:[18]
$$-E\psi = \frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2}$$
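which has oscillatory solutions for E > 0 (the $C_n$ are arbitrary constants):
$$\psi_E(x) = C_1 e^{i\sqrt{2mE/\hbar^2}\,x} + C_2 e^{-i\sqrt{2mE/\hbar^2}\,x}$$
and exponential solutions for E < 0:
$$\psi_{-|E|}(x) = C_1 e^{\sqrt{2m|E|/\hbar^2}\,x} + C_2 e^{-\sqrt{2m|E|/\hbar^2}\,x}.$$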
The exponentially growing solutions have an infinite norm, and are not physical. They are not allowed in a finite
volume with periodic or fixed boundary conditions.
For a constant potential V the solution is oscillatory for E > V and exponential for E < V, corresponding to energies
which are allowed or disallowed in classical mechanics. Oscillatory solutions have a classically allowed energy and
correspond to actual classical motions, while the exponential solutions have a disallowed energy and describe a small
amount of quantum bleeding into the classically disallowed region, that is, quantum tunneling. If the potential V grows at
infinity, the motion is classically confined to a finite region, which means that in quantum mechanics every solution
becomes an exponential far enough away. The condition that the exponential is decreasing restricts the energy levels
to a discrete set, called the allowed energies.
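The way the decay condition selects discrete energies can be illustrated with a simple shooting method. This is a sketch under stated assumptions rather than a method taken from the text: it uses units with ħ = m = 1, the harmonic potential V(x) = x²/2 as an example of a potential that grows at infinity, a finite integration window, and a basic central-difference recursion. The names shoot and find_level are hypothetical.

```python
def shoot(energy, potential, x_max=5.0, steps=2000):
    """Integrate psi'' = 2 (V - E) psi from -x_max to +x_max (hbar = m = 1).

    Starts from psi(-x_max) = 0 with a tiny slope and returns psi(+x_max).
    Away from an allowed energy the result blows up with one sign or the other.
    """
    h = 2.0 * x_max / steps
    psi_prev, psi = 0.0, 1e-6
    for n in range(1, steps):
        x = -x_max + n * h
        psi_next = 2.0 * psi - psi_prev + h * h * 2.0 * (potential(x) - energy) * psi
        psi_prev, psi = psi, psi_next
    return psi


def find_level(e_low, e_high, potential, tol=1e-9):
    """Bisect on the sign of psi(+x_max); assumes exactly one level in the bracket."""
    f_low = shoot(e_low, potential)
    while e_high - e_low > tol:
        e_mid = 0.5 * (e_low + e_high)
        f_mid = shoot(e_mid, potential)
        if f_low * f_mid > 0:
            e_low, f_low = e_mid, f_mid
        else:
            e_high = e_mid
    return 0.5 * (e_low + e_high)


def harmonic(x):
    return 0.5 * x * x


ground = find_level(0.1, 1.0, harmonic)
first_excited = find_level(1.0, 2.0, harmonic)
print(ground, first_excited)  # close to the textbook values 1/2 and 3/2
```

Only at isolated energies does the growing exponential drop out on the far side, which is why the bisection converges to a discrete set of values.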
Nonlinear equation
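The nonlinear Schrodinger equation is the partial differential equation (in dimensionless form)[19]
$$i\partial_t\psi = -\frac{1}{2}\partial_x^2\psi + \kappa|\psi|^2\psi$$
for the complex field $\psi(x, t)$.
This equation arises from the Hamiltonian[19]
$$H = \int dx\left[\frac{1}{2}|\partial_x\psi|^2 + \frac{\kappa}{2}|\psi|^4\right]$$
with the Poisson brackets
$$\{\psi(x), \psi(y)\} = \{\psi^*(x), \psi^*(y)\} = 0$$
$$\{\psi^*(x), \psi(y)\} = i\delta(x - y).$$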
It must be noted that this is a classical field equation. Unlike its linear counterpart, it never describes the time
evolution of a quantum state.
Properties
The Schrodinger equation has certain properties.
Local conservation of probability
The probability density of a particle is $\Psi^*(x, t)\Psi(x, t)$. The probability flux is defined as [in units of
(probability)/(area × time)]:
$$\mathbf{j} = \frac{\hbar}{2mi}\left(\Psi^*\nabla\Psi - \Psi\nabla\Psi^*\right) = \frac{\hbar}{m}\operatorname{Im}\left(\Psi^*\nabla\Psi\right).$$
The probability flux satisfies the continuity equation:
$$\frac{\partial}{\partial t}P(x, t) + \nabla\cdot\mathbf{j} = 0$$
where $P(x, t)$ is the probability density [measured in units of (probability)/(volume)]. This equation is the
mathematical equivalent of the probability conservation law.
For a plane wave:
$$\Psi(x, t) = A e^{i(kx - \omega t)}$$
$$j(x, t) = |A|^2\frac{k\hbar}{m}.$$
So that not only is the probability of finding the particle the same everywhere, but the probability flux is as expected
from an object moving at the classical velocity p/m. The Schrodinger equation admits a probability
flux because all the hopping is local and forward in time.
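The plane-wave flux can be reproduced numerically from the definition of j. The sketch below is illustrative only: it assumes one dimension, natural units (HBAR and MASS are chosen constants, not values from the text), and a central finite difference for the gradient. The helper names are hypothetical.

```python
import cmath

HBAR = 1.0  # natural units (assumption)
MASS = 1.0  # particle mass (assumption)


def make_plane_wave(amplitude, k, omega, t):
    """Return psi(x) = A exp(i (k x - omega t)) at a fixed time t."""
    def psi(x):
        return amplitude * cmath.exp(1j * (k * x - omega * t))
    return psi


def probability_flux(psi, x, h=1e-5):
    """j = (hbar / m) Im(psi* dpsi/dx), with dpsi/dx from a central difference."""
    dpsi_dx = (psi(x + h) - psi(x - h)) / (2.0 * h)
    return (HBAR / MASS) * (psi(x).conjugate() * dpsi_dx).imag


amplitude, k = 0.5 + 0.5j, 3.0
psi = make_plane_wave(amplitude, k, omega=HBAR * k ** 2 / (2.0 * MASS), t=0.2)

numeric = probability_flux(psi, x=1.1)
analytic = abs(amplitude) ** 2 * HBAR * k / MASS  # |A|^2 k hbar / m
print(numeric, analytic)
```

The two numbers agree to within the finite-difference error, and the result does not depend on the point x at which the flux is evaluated.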
Relativity
The Schrodinger equation does not take into account relativistic effects; as a wave equation, it is invariant under a
Galilean transformation, but not under a Lorentz transformation. But in order to include relativity, the physical
picture must be altered.
The Klein-Gordon equation uses the relativistic mass-energy relation:
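$$E^2 = p^2c^2 + m^2c^4$$
to produce the differential equation:
$$\left(\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\right)\psi = \frac{m^2c^2}{\hbar^2}\psi$$
which is relativistically invariant.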
Solutions
Some general techniques are:
• Perturbation theory
• The variational method
• Quantum Monte Carlo methods
• Density functional theory
• The WKB approximation and semi-classical expansion
In some special cases, special methods can be used:
• List of quantum-mechanical systems with analytical solutions
• Hartree-Fock method and post Hartree-Fock methods
• Discrete delta-potential method
Notes
[1] Schrodinger, E. (1926). "An Undulatory Theory of the Mechanics of Atoms and Molecules" (http://home.tiscali.nl/physis/HistoricPaper/Schroedinger/Schroedinger1926c.pdf). Physical Review 28 (6): 1049-1070. doi:10.1103/PhysRev.28.1049.
[2] Shankar, R. (1994). Principles of Quantum Mechanics (2nd ed.). Kluwer Academic/Plenum Publishers, p. 143. ISBN 978-0-306-44790-7.
[3] Shankar, R. (1994). Principles of Quantum Mechanics (2nd ed.). Kluwer Academic/Plenum Publishers, p. 143ff. ISBN 978-0-306-44790-7.
[4] Shankar, R. (1994). Principles of Quantum Mechanics (2nd ed.). Kluwer Academic/Plenum Publishers, p. 145. ISBN 978-0-306-44790-7.
[5] de Broglie, L. (1925). "Recherches sur la theorie des quanta [On the Theory of Quanta]" (http://tel.archives-ouvertes.fr/docs/00/04/70/78/PDF/tel-00006807.pdf). Annales de Physique 10 (3): 22-128. Translated version (http://www.ensmp.fr/aflb/LDB-oeuvres/De_Broglie_Kracklauer.pdf).
[6] Schrodinger, E. (1984). Collected papers. Friedrich Vieweg und Sohn. ISBN 3700105738. See introduction to first 1926 paper.
[7] Sommerfeld, A. (1919). Atombau und Spektrallinien. Braunschweig: Friedrich Vieweg und Sohn. ISBN 3871444847.
[8] For an English source, see Haar, T. The Old Quantum Theory.
[9] Rhodes, R. (1986). Making of the Atomic Bomb. Touchstone. ISBN 0-671-44133-7.
[10] Schrodinger, E. (1926). "Quantisierung als Eigenwertproblem; von Erwin Schrodinger" (http://gallica.bnf.fr/ark:/12148/bpt6k153811.image.langFR.f373.pagination). Annalen der Physik (Leipzig): 361-377.
[11] Einstein, A.; et al. Letters on Wave Mechanics: Schrodinger-Planck-Einstein-Lorentz.
[12] Moore, W.J. (1992). Schrodinger: Life and Thought. Cambridge University Press, p. 219. ISBN 0-521-43767-9.
[13] Moore, W.J. (1992). Schrodinger: Life and Thought. Cambridge University Press, p. 220. ISBN 0-521-43767-9.
[14] It is clear, even in his last year of life, as shown in a letter to Max Born, that Schrodinger never accepted the Copenhagen interpretation (cf. p. 220). Moore, W.J. (1992). Schrodinger: Life and Thought. Cambridge University Press, p. 479. ISBN 0-521-43767-9.
[15] Shankar, R. (1994). Principles of Quantum Mechanics. Kluwer Academic/Plenum Publishers, pp. 143ff. ISBN 978-0-306-44790-7.
[16] Shankar, R. (1994). Principles of Quantum Mechanics. Kluwer Academic/Plenum Publishers, p. 141. ISBN 978-0-306-44790-7.
[17] Feynman, R.P.; Leighton, R.B.; Sands, M. (1964). "Operators". The Feynman Lectures on Physics. 3. Addison-Wesley, pp. 20-7. ISBN 0201021153.
[18] Shankar, R. (1994). Principles of Quantum Mechanics. Kluwer Academic/Plenum Publishers, pp. 151ff. ISBN 978-0-306-44790-7.
[19] V.E. Zakharov; S.V. Manakov (1974). "On the complete integrability of a nonlinear Schrodinger equation". Journal of Theoretical and Mathematical Physics 19 (3): 551-559. doi:10.1007/BF01035568. Originally in: Teoreticheskaya i Matematicheskaya Fizika 19 (3): 332-343. June, 1974.
References
• Paul Adrien Maurice Dirac (1958). The Principles of Quantum Mechanics (4th ed.). Oxford University Press.
• David J. Griffiths (2004). Introduction to Quantum Mechanics (2nd ed.). Benjamin Cummings.
ISBN 0131244051.
• Richard Liboff (2002). Introductory Quantum Mechanics (4th ed.). Addison Wesley. ISBN 0805387145.
• David Halliday (2007). Fundamentals of Physics (8th ed.). Wiley. ISBN 0471159506.
• Serway, Moses, and Moyer (2004). Modern Physics (3rd ed.). Brooks Cole. ISBN 0534493408.
• Walter John Moore (1992). Schrodinger: Life and Thought. Cambridge University Press. ISBN 0521437679.
• Schrodinger, Erwin (December 1926). "An Undulatory Theory of the Mechanics of Atoms and Molecules". Phys.
Rev. 28 (6): 1049-1070. doi:10.1103/PhysRev.28.1049.
External links
• Quantum Physics (http://www.lightandmatter.com/html_books/0sn/chl3/chl3.html) - a textbook with a
treatment of the time-independent Schrodinger equation
• Linear Schrodinger Equation (http://eqworld.ipmnet.ru/en/solutions/lpde/lpdel08.pdf) at EqWorld: The
World of Mathematical Equations.
• Nonlinear Schrodinger Equation (http://eqworld.ipmnet.ru/en/solutions/npde/npdel403.pdf) at EqWorld:
The World of Mathematical Equations.
• The Schrodinger Equation in One Dimension (http://www.colorado.edu/UCB/AcademicAffairs/ArtsSciences/
physics/TZD/PageProofsl/TAYL07-203-247.I.pdf) as well as the directory of the book (http://www.
colorado.edu/UCB/AcademicAffairs/ArtsSciences/physics/TZD/PageProofsl/).
• All about 3D Schrodinger Equation (http://hyperphysics.phy-astr.gsu.edu/hbase/hframe.html)
• Mathematical aspects of Schrodinger equations are discussed on the Dispersive PDE Wiki (http://tosio.math.toronto.edu/wiki/index.php/Main_Page).
• Web-Schrodinger: Interactive solution of the 2D time dependent Schrodinger equation (http://www.nanotechnology.hu/online/web-schroedinger/index.html)
• An alternate derivation of the Schrodinger Equation (http://behindtheguesses.blogspot.com/2009/06/
schrodinger-equation-corrections.html)
• Online software- Periodic Potential Lab (http://nanohub.org/resources/3847) Solves the time independent
Schrodinger equation for arbitrary periodic potentials.
Dirac equation
The Dirac equation is a relativistic quantum mechanical wave equation formulated by British physicist Paul Dirac
in 1928. It provides a description of elementary spin-1/2 particles, such as electrons, consistent with both the
principles of quantum mechanics and the theory of special relativity. The equation demands the existence of
antiparticles and actually predated their experimental discovery. This made the discovery of the positron, the
antiparticle of the electron, one of the greatest triumphs of modern theoretical physics.
Mathematical formulation
The Dirac equation in the form originally proposed by Dirac is:
$$\left(\beta mc^2 + \sum_{k=1}^{3}\alpha_k p_k c\right)\psi(\mathbf{x}, t) = i\hbar\frac{\partial\psi(\mathbf{x}, t)}{\partial t}$$
where
m is the rest mass of the electron,
c is the speed of light,
p is the momentum operator,
x and t are the space and time coordinates,
$\hbar = h/2\pi$ is the reduced Planck constant, also known as Dirac's constant.
The new elements in this equation are the 4×4 matrices $\alpha_k$ and $\beta$, and the four-component wavefunction $\psi$. The
matrices are all Hermitian and have squares equal to the identity matrix:
$$\alpha_i^2 = \beta^2 = I_4$$
and they all mutually anticommute: $\{\alpha_i, \alpha_j\} = 0$ and $\{\alpha_i, \beta\} = 0$. Explicitly,
$$\alpha_i\alpha_j = -\alpha_j\alpha_i, \qquad \alpha_i\beta = -\beta\alpha_i$$
where i and j are distinct and range from 1 to 3. These matrices, and the form of the wavefunction, have a deep
mathematical significance. The algebraic structure represented by the Dirac matrices had been created some 50 years
earlier by the English mathematician W. K. Clifford. In turn, Clifford's formulation had been based on the mid-19th
century work of the German mathematician Hermann Grassmann in his "Lineare Ausdehnungslehre" (Theory of
Linear Extensions). The latter had been regarded as well-nigh incomprehensible by most of his contemporaries. The
appearance of something so seemingly abstract, at such a late date, in such a direct physical manner, is one of the
most remarkable chapters in the history of physics.
The commutation rules are designed so that a solution of Dirac's equation will automatically also be a solution of
$$\left((mc^2)^2 + \sum_{k=1}^{3}(p_k c)^2\right)\psi = E^2\psi$$
which is the relativistic energy-momentum equation.
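These algebraic properties can be checked directly with small matrices. The sketch below is not part of the original argument: it assumes the standard Dirac representation built from the Pauli matrices, sets c = 1, treats the momentum components as ordinary numbers (a momentum eigenstate), and uses plain nested lists; all helper names are hypothetical.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def block(a, b, c, d):
    """Assemble a 4x4 matrix from four 2x2 blocks [[a, b], [c, d]]."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]


def close(a, b, tol=1e-12):
    return all(abs(x - y) <= tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]  # Pauli matrices
ALPHA = [block(Z2, s, s, Z2) for s in SIGMA]   # assumed Dirac representation
BETA = block(I2, Z2, Z2, scale(-1, I2))
I4 = block(I2, Z2, Z2, I2)
Z4 = scale(0, I4)

matrices = ALPHA + [BETA]
squares_ok = all(close(matmul(m, m), I4) for m in matrices)
anticommute_ok = all(
    close(add(matmul(matrices[i], matrices[j]), matmul(matrices[j], matrices[i])), Z4)
    for i in range(4) for j in range(i + 1, 4))

# H = beta m + sum_k alpha_k p_k with c = 1; its square should be (m^2 + p^2) I.
mass, momentum = 1.0, (0.3, -0.4, 1.2)
hamiltonian = scale(mass, BETA)
for p_k, alpha_k in zip(momentum, ALPHA):
    hamiltonian = add(hamiltonian, scale(p_k, alpha_k))
energy_squared = mass ** 2 + sum(p * p for p in momentum)
h_squared_ok = close(matmul(hamiltonian, hamiltonian), scale(energy_squared, I4))

print(squares_ok, anticommute_ok, h_squared_ok)
```

The cross terms in the square of the Hamiltonian cancel precisely because the matrices anticommute, leaving only the sum of squares that appears in the energy-momentum relation.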
Comparison with the Schrodinger equation
The Dirac equation is superficially similar to the Schrodinger equation for a free particle:
$$-\frac{\hbar^2}{2m}\nabla^2\phi = i\hbar\frac{\partial}{\partial t}\phi.$$
The left side represents the square of the momentum operator divided by twice the mass, which is the nonrelativistic
kinetic energy. A relativistic generalization of this equation requires that space and time derivatives enter
symmetrically, as they do in the relativistic Maxwell equations: the derivatives must be of the same order in space
and time. In relativity, the momentum and the energy are the space and time parts of a geometrical space-time
vector, the 4-momentum, and they are related by the relativistically invariant relation
$$\frac{E^2}{c^2} - p^2 = m^2c^2$$
which says that the length of this vector is proportional to the rest mass m. Replacing E and p by $i\hbar\frac{\partial}{\partial t}$ and $-i\hbar\nabla$ as Schrodinger
theory requires, we get a relativistic equation:
$$\left(\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\right)\phi = \frac{m^2c^2}{\hbar^2}\phi$$
and the wave function $\phi$ is a relativistic scalar: a complex number which has the same numerical value in all
frames. Because the equation is second order in the time derivative, one must specify the initial values of not only
$\phi$, but also of $\partial_t\phi$. This is normal for classical waves, where the initial conditions are the position and velocity.
However, in quantum mechanics, the wavefunction is supposed to be the complete description. That is, just knowing
the wavefunction should determine the future.
In the Schrodinger theory, the probability density is given by the positive definite expression
$$\rho = \phi^*\phi$$
and its current by
$$\mathbf{J} = -\frac{i\hbar}{2m}\left(\phi^*\nabla\phi - \phi\nabla\phi^*\right)$$
and the conservation of probability density has a local form:
$$\nabla\cdot\mathbf{J} + \frac{\partial\rho}{\partial t} = 0.$$
In a relativistic theory, the form of the probability density and the current must form a four vector, so the form of the
probability density can be found from the current just by replacing $\nabla$ by $\partial_t$:
$$\rho = \frac{i\hbar}{2m}\left(\phi^*\partial_t\phi - \phi\,\partial_t\phi^*\right).$$
Everything is relativistic now, but the probability density is not positive definite, because the initial values of both $\phi$
and $\partial_t\phi$ can be freely chosen. This expression reduces to Schrodinger's density and current for superpositions of
positive frequency waves whose wavelength is long compared to the Compton wavelength, that is, for nonrelativistic
motions. It reduces to a negative definite quantity for superpositions of negative frequency waves only. It mixes up
both signs when forces which have an appreciable amplitude to produce relativistic motions are involved, at which
point scattering can produce particles and antiparticles.
Although it was not a successful description of a single particle, this equation is resurrected in quantum field theory,
where it is known as the Klein-Gordon equation, and describes a relativistic spin-0 complex field. The non-positive
probability density and current are the charge-density and current, while the particles are described by a
mode-expansion.
To interpret the Klein-Gordon equation as an equation for the probability amplitude for a single particle at a given
position, negative frequency solutions must be interpreted as describing the particle travelling backwards in time,
propagating into the past.
The equation with this interpretation does not predict the future from the present except in the nonrelativistic limit,
rather it places a global constraint on the amplitudes. This can be used to construct a perturbation expansion with
particles zipping backwards and forwards in time, the Feynman diagrams, but it does not allow a straightforward
wavefunction description, since each particle has its own separate proper time. Ultimately, the entity the
Klein-Gordon equation (and Dirac equation) acts on is a field, not a wavefunction. The fields are physical fields
whose values are observables. They are not probability amplitudes.
Dirac's coup
Thinking in terms of wavefunctions rather than fields, Dirac reasoned that the necessary equation is first-order in
both space and time. One could formally take the relativistic expression for the energy $E = c\sqrt{p^2 + m^2c^2}$,
replace p by its operator equivalent, expand the square root in an infinite series of derivative operators, set up an
eigenvalue problem, then solve the equation formally by iterations. Most physicists had little faith in such a process,
even if it were technically possible.
As the story goes, Dirac was staring into the fireplace at Cambridge, pondering this problem, when he hit upon the
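idea of taking the square root of the wave operator thus:
$$\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2} = \left(A\partial_x + B\partial_y + C\partial_z + \frac{i}{c}D\partial_t\right)\left(A\partial_x + B\partial_y + C\partial_z + \frac{i}{c}D\partial_t\right).$$
On multiplying out the right side we see that, to get all the cross-terms such as $\partial_x\partial_y$ to vanish, we must assume
$$AB + BA = 0, \ldots$$
with
$$A^2 = B^2 = \ldots = 1.$$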
Dirac, who had just then been intensely involved with working out the foundations of Heisenberg's matrix
mechanics, immediately understood that these conditions could be met if A, B... are matrices, with the implication
that the wave function has multiple components. This immediately explained the appearance of two-component wave
functions in Pauli's phenomenological theory of spin, something that up until then had been regarded as mysterious,
even to Pauli himself. However, one needs at least 4x4 matrices to set up a system with the properties desired — so
the wave function had four components, not two, as in the Pauli theory.
Given the factorization in terms of these matrices, one can now write down immediately an equation
$$\left(A\partial_x + B\partial_y + C\partial_z + \frac{i}{c}D\partial_t\right)\psi = \kappa\psi$$
with $\kappa$ to be determined. Applying again the matrix operator on either side yields
$$\left(\nabla^2 - \frac{1}{c^2}\partial_t^2\right)\psi = \kappa^2\psi.$$
On taking $\kappa = mc/\hbar$ we find that all the components of the wave function individually satisfy the relativistic
energy-momentum relation. Thus the sought-for equation that is first-order in both space and time is
$$\left(A\partial_x + B\partial_y + C\partial_z + \frac{i}{c}D\partial_t - \frac{mc}{\hbar}\right)\psi = 0.$$
With $(A, B, C) = i\beta\alpha_k$ and $D = \beta$, we get the Dirac equation.
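The conditions on A, B, C and D can be verified for the choice quoted above. This is a rough check under stated assumptions, not a reproduction of Dirac's own working: it takes the standard Dirac representation for the alpha and beta matrices, and it tests the factorization on a plane wave, where each derivative becomes a number (∂ₓ → i kₓ, ∂ₜ → −iω). Helper names and the numerical values are illustrative.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def block(a, b, c, d):
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]


def close(a, b, tol=1e-12):
    return all(abs(x - y) <= tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


I2, Z2 = [[1, 0], [0, 1]], [[0, 0], [0, 0]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
ALPHA = [block(Z2, s, s, Z2) for s in SIGMA]   # assumed Dirac representation
BETA = block(I2, Z2, Z2, scale(-1, I2))
I4 = block(I2, Z2, Z2, I2)
Z4 = scale(0, I4)

# (A, B, C) = i beta alpha_k and D = beta, as in the text.
ABCD = [scale(1j, matmul(BETA, a)) for a in ALPHA] + [BETA]

squares_ok = all(close(matmul(m, m), I4) for m in ABCD)
cross_terms_vanish = all(
    close(add(matmul(ABCD[i], ABCD[j]), matmul(ABCD[j], ABCD[i])), Z4)
    for i in range(4) for j in range(i + 1, 4))

# On a plane wave the first-order operator becomes a matrix; its square should be
# the symbol of the wave operator, (-|k|^2 + omega^2 / c^2) times the identity.
c, omega, k = 2.0, 1.7, (0.3, -0.4, 1.2)
coefficients = [1j * k_i for k_i in k] + [(1j / c) * (-1j * omega)]
operator = Z4
for coeff, m in zip(coefficients, ABCD):
    operator = add(operator, scale(coeff, m))
wave_symbol = -sum(k_i * k_i for k_i in k) + omega ** 2 / c ** 2
factorization_ok = close(matmul(operator, operator), scale(wave_symbol, I4))

print(squares_ok, cross_terms_vanish, factorization_ok)
```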
Comparison with the Pauli theory
The necessity of introducing half-integral spin goes back experimentally to the results of the Stern-Gerlach
experiment. A beam of atoms is run through a strong inhomogeneous magnetic field, and the beam then splits into N parts
depending on the intrinsic angular momentum of the atoms. It was found that for silver atoms, the beam was split in
two. The ground-state angular momentum therefore could not be integral, because even if the intrinsic angular momentum of the atoms
were as small as possible, 1, the beam would be split into 3 parts, corresponding to atoms with $L_z = -1$, 0, and +1.
The conclusion is that silver atoms have net intrinsic angular momentum of 1/2. Pauli set up a theory which explained
this splitting by introducing a two-component wave function and a corresponding correction term in the
Hamiltonian, representing a semi-classical coupling of this wave function to an applied magnetic field, as so:
$$H = \frac{1}{2m}\left(\boldsymbol{\sigma}\cdot\left(\mathbf{p} - \frac{e}{c}\mathbf{A}\right)\right)^2 + eA^0.$$
Here $A^\mu$ is the applied electromagnetic potential, and the three sigmas are Pauli matrices; e is the charge of the
particle, e.g. $e = -e_0$ for the electron. On squaring out the first term, a residual interaction with the magnetic field
is found, along with the usual Hamiltonian of a charged particle interacting with an applied field:
$$H = \frac{1}{2m}\left(\mathbf{p} - \frac{e}{c}\mathbf{A}\right)^2 + eA^0 - \frac{e\hbar}{2mc}\boldsymbol{\sigma}\cdot\mathbf{B}.$$
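This Hamiltonian is now a 2 × 2 matrix, so the Schrodinger equation based on it,
$$H\phi = i\hbar\frac{\partial\phi}{\partial t}$$
must use a two-component wave function. Pauli had introduced the sigma matrices
$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$
as pure phenomenology; Dirac now had a theoretical argument that implied that spin was somehow the
consequence of the marriage of quantum theory to relativity.
The Pauli matrices share the same properties as the Dirac matrices: they are all Hermitian, square to 1, and
anticommute. This allows one to immediately find a representation of the Dirac matrices in terms of the Pauli
matrices:
$$\alpha_k = \begin{pmatrix} 0 & \sigma_k \\ \sigma_k & 0 \end{pmatrix}, \qquad \beta = \begin{pmatrix} I_2 & 0 \\ 0 & -I_2 \end{pmatrix}.$$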
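The Dirac equation now may be written as an equation coupling two-component spinors:
$$\begin{pmatrix} mc^2 & c\,\boldsymbol{\sigma}\cdot\mathbf{p} \\ c\,\boldsymbol{\sigma}\cdot\mathbf{p} & -mc^2 \end{pmatrix}\begin{pmatrix} \phi_+ \\ \phi_- \end{pmatrix} = i\hbar\frac{\partial}{\partial t}\begin{pmatrix} \phi_+ \\ \phi_- \end{pmatrix}.$$
Notice that on the diagonal we find the rest energy of the particle. If we set the momentum to zero, that is, bring the
particle to rest, then we have
$$i\hbar\frac{\partial}{\partial t}\begin{pmatrix} \phi_+ \\ \phi_- \end{pmatrix} = \begin{pmatrix} mc^2 & 0 \\ 0 & -mc^2 \end{pmatrix}\begin{pmatrix} \phi_+ \\ \phi_- \end{pmatrix}.$$
The equations for the individual two-spinors are now decoupled, and we see that the "top" and "bottom" two-spinors
are individually eigenfunctions of the energy with eigenvalues equal to plus and minus the rest energy, respectively.
The appearance of this negative energy eigenvalue is completely consistent with relativity.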
It should be strongly emphasized that this separation in the rest frame is not an invariant statement — the "bottom"
two-spinor does not represent antimatter as such in general. The entire four-component spinor represents an
irreducible whole — in general, states will have an admixture of positive and negative energy components. If we
couple the Dirac equation to an electromagnetic field, as in the Pauli theory, then the positive and negative energy
parts will be mixed together, even if they are originally decoupled. Dirac's main problem was to find a consistent
interpretation of this mixing. As we shall see below, it brings a new phenomenon into physics — matter/antimatter
creation and annihilation.
Covariant form and relativistic invariance
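The explicitly covariant form of the Dirac equation is (employing the Einstein summation convention):
$$-i\hbar\gamma^\mu\partial_\mu\psi + mc\,\psi = 0.$$
In the above, $\gamma^\mu$ are the Dirac matrices. $\gamma^0$ is Hermitian, and the $\gamma^k$ are anti-Hermitian, with the definition
$$\gamma^0 = \beta$$
$$\gamma^k = \gamma^0\alpha_k.$$
Thus
$$\gamma^0 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}, \quad \gamma^1 = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & -1 & 0 & 0 \\ -1 & 0 & 0 & 0 \end{pmatrix},$$
$$\gamma^2 = \begin{pmatrix} 0 & 0 & 0 & -i \\ 0 & 0 & i & 0 \\ 0 & i & 0 & 0 \\ -i & 0 & 0 & 0 \end{pmatrix}, \quad \gamma^3 = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \\ -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix}.$$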
This may be summarized using the Minkowski metric on spacetime in the form
$$\{\gamma^\mu, \gamma^\nu\} = 2\eta^{\mu\nu}$$
where the bracket expression {a, b} means ab + ba, the anticommutator. These are the defining relations of a
Clifford algebra over a pseudo-orthogonal 4-d space with metric signature (+ − − −). (Note that one may also
employ the metric form (− + + +) by multiplying all the gammas by a factor of i.) The specific Clifford algebra
employed in the Dirac equation is known as the Dirac algebra.
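The Clifford relations and the Hermiticity properties can be confirmed for the explicit matrices. The following sketch assumes the Dirac representation (gamma matrices built from beta and the alpha matrices as defined above) and the metric signature (+ − − −); the helper functions are hypothetical and use nested lists only.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def block(a, b, c, d):
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]


def dagger(a):
    """Conjugate transpose."""
    return [[a[j][i].conjugate() for j in range(len(a))] for i in range(len(a[0]))]


def close(a, b, tol=1e-12):
    return all(abs(x - y) <= tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


I2, Z2 = [[1, 0], [0, 1]], [[0, 0], [0, 0]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
ALPHA = [block(Z2, s, s, Z2) for s in SIGMA]
BETA = block(I2, Z2, Z2, scale(-1, I2))
I4 = block(I2, Z2, Z2, I2)

GAMMA = [BETA] + [matmul(BETA, a) for a in ALPHA]  # gamma^0, gamma^1, gamma^2, gamma^3
ETA = [1, -1, -1, -1]                              # diagonal of the Minkowski metric

clifford_ok = all(
    close(add(matmul(GAMMA[mu], GAMMA[nu]), matmul(GAMMA[nu], GAMMA[mu])),
          scale(2 * ETA[mu] if mu == nu else 0, I4))
    for mu in range(4) for nu in range(4))

gamma0_hermitian = close(dagger(GAMMA[0]), GAMMA[0])
spatial_antihermitian = all(close(dagger(g), scale(-1, g)) for g in GAMMA[1:])

print(clifford_ok, gamma0_hermitian, spatial_antihermitian)
```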
The Dirac equation may be interpreted as an eigenvalue expression, where the rest mass is proportional to an
eigenvalue of the 4-momentum operator, the proportion being the speed of light in vacuo:
$$\hat{P}\psi = mc\,\psi, \qquad \hat{P} = i\hbar\gamma^\mu\partial_\mu.$$
In practice, physicists often use units of measure such that $\hbar$ and c are equal to 1, known as "natural" units. The
equation then takes the simple form
$$(-i\gamma^\mu\partial_\mu + m)\psi = 0$$
or, if Feynman slash notation is employed,
$$(-i\partial\!\!\!/ + m)\psi = 0.$$
A fundamental theorem states that if two distinct sets of matrices are given that both satisfy the Clifford relations,
then they are connected to each other by a similarity transformation:
$$\gamma^{\mu\prime} = S^{-1}\gamma^\mu S.$$
If in addition the matrices are all unitary, as are the Dirac set, then S itself is unitary:
$$\gamma^{\mu\prime} = U^\dagger\gamma^\mu U.$$
The transformation U is unique up to a multiplicative factor of absolute value 1. Let us now imagine a Lorentz
transformation to have been performed on the derivative operators, which form a covariant vector. For the operator
$\gamma^\mu\partial_\mu$ to remain invariant, the gammas must transform among themselves as a contravariant vector with respect to
their spacetime index. These new gammas will themselves satisfy the Clifford relations, because of the orthogonality
of the Lorentz transformation. By the fundamental theorem, we may replace the new set by the old set subject to a
unitary transformation. In the new frame, remembering that the rest mass is a relativistic scalar, the Dirac equation
will then take the form
$$(-iU^\dagger\gamma^\mu U\partial_\mu' + m)\psi(x', t') = 0$$
$$U^\dagger(-i\gamma^\mu\partial_\mu' + m)U\psi(x', t') = 0.$$
If we now define the transformed spinor
$$\psi' = U\psi$$
then we have the transformed Dirac equation
$$(-i\gamma^\mu\partial_\mu' + m)\psi'(x', t') = 0.$$
Thus, once we settle on a unitary representation of the gammas, it is final provided we transform the spinor
according to the unitary transformation that corresponds to the given Lorentz transformation.
These considerations reveal the origin of the gammas in geometry, hearkening back to Grassmann's original
motivation: they represent a fixed basis of unit vectors in spacetime. Similarly, products of the gammas such as
$\gamma_\mu\gamma_\nu$ represent oriented surface elements, and so on. With this in mind, we can find the form of the unit volume
element on spacetime in terms of the gammas as follows. By definition, it is
$$V = \frac{1}{4!}\epsilon_{\mu\nu\alpha\beta}\gamma^\mu\gamma^\nu\gamma^\alpha\gamma^\beta.$$
For this to be an invariant, the epsilon symbol must be a tensor, and so must contain a factor of $\sqrt{g}$, where g is the
determinant of the metric tensor. Since this is negative, that factor is imaginary. Thus
$$V = i\gamma^0\gamma^1\gamma^2\gamma^3.$$
This matrix is given the special symbol $\gamma_5$, owing to its importance when one is considering improper
transformations of spacetime, that is, those that change the orientation of the basis vectors. In the representation we
are using for the gammas, it is
$$\gamma_5 = \begin{pmatrix} 0 & I_2 \\ I_2 & 0 \end{pmatrix}.$$
Also note that we could as easily have taken the negative square root of the determinant of g; the choice amounts to
an initial handedness convention.
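The explicit form of this matrix can be reproduced by multiplying out the four gammas. The sketch below assumes the Dirac representation used in this article and hypothetical list-based helpers; it also checks two standard consequences, that the matrix squares to the identity and anticommutes with every gamma.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def block(a, b, c, d):
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]


def close(a, b, tol=1e-12):
    return all(abs(x - y) <= tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


I2, Z2 = [[1, 0], [0, 1]], [[0, 0], [0, 0]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
BETA = block(I2, Z2, Z2, scale(-1, I2))
GAMMA = [BETA] + [matmul(BETA, block(Z2, s, s, Z2)) for s in SIGMA]
I4 = block(I2, Z2, Z2, I2)
Z4 = scale(0, I4)

# gamma_5 = i gamma^0 gamma^1 gamma^2 gamma^3
product = GAMMA[0]
for g in GAMMA[1:]:
    product = matmul(product, g)
gamma5 = scale(1j, product)

matches_block_form = close(gamma5, block(Z2, I2, I2, Z2))
squares_to_identity = close(matmul(gamma5, gamma5), I4)
anticommutes_with_gammas = all(
    close(add(matmul(gamma5, g), matmul(g, gamma5)), Z4) for g in GAMMA)

print(matches_block_form, squares_to_identity, anticommutes_with_gammas)
```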
Lorentz Invariance of the Dirac equation
The Lorentz invariance of the Dirac equation follows from its covariant nature.
Comparison with the Klein-Gordon equation
Using the Feynman slash notation, the Klein-Gordon equation can be factored:
$$0 = (\partial^2 + m^2)\psi = (\partial\!\!\!/^2 + m^2)\psi = (i\partial\!\!\!/ + m)(-i\partial\!\!\!/ + m)\psi.$$
The last factor is simply the Dirac operator. Hence any solution to the Dirac equation is automatically a solution to
the Klein-Gordon equation:
$$(-i\partial\!\!\!/ + m)\psi = 0 \;\Rightarrow\; (\partial^2 + m^2)\psi = 0.$$
But the converse is not true; not all solutions to the Klein-Gordon equation solve the Dirac equation.
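In momentum space the factorization becomes a statement about ordinary matrices, which makes it easy to test. The sketch below is illustrative and rests on assumptions not stated in the text: a plane wave proportional to exp(−i p·x), so that i∂ acts as multiplication by the 4-momentum, natural units, the Dirac representation, and the signature (+ − − −). The helper names and numerical values are hypothetical.

```python
import math


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def block(a, b, c, d):
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]


def close(a, b, tol=1e-12):
    return all(abs(x - y) <= tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


I2, Z2 = [[1, 0], [0, 1]], [[0, 0], [0, 0]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
BETA = block(I2, Z2, Z2, scale(-1, I2))
GAMMA = [BETA] + [matmul(BETA, block(Z2, s, s, Z2)) for s in SIGMA]
I4 = block(I2, Z2, Z2, I2)
Z4 = scale(0, I4)


def slash(energy, momentum):
    """p-slash = gamma^0 E - gamma . p for the 4-momentum (E, p)."""
    result = scale(energy, GAMMA[0])
    for p_k, g in zip(momentum, GAMMA[1:]):
        result = add(result, scale(-p_k, g))
    return result


mass, momentum = 1.0, (0.3, -0.4, 1.2)
energy = math.sqrt(mass ** 2 + sum(p * p for p in momentum))  # on-shell energy
p_slash = slash(energy, momentum)

# p-slash squared is the invariant E^2 - |p|^2 = m^2 times the identity ...
square_ok = close(matmul(p_slash, p_slash), scale(mass ** 2, I4))
# ... so the product of the two first-order factors vanishes on shell.
factor_product = matmul(add(p_slash, scale(mass, I4)),
                        add(scale(-1, p_slash), scale(mass, I4)))
klein_gordon_ok = close(factor_product, Z4)

print(square_ok, klein_gordon_ok)
```

The product vanishes for every on-shell momentum, while each factor separately has a nontrivial null space; this is the matrix counterpart of the statement that Dirac solutions satisfy the Klein-Gordon equation but not conversely.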
Adjoint equation and Dirac current
By defining the adjoint spinor
$$\bar{\psi} = \psi^\dagger\gamma^0$$
where $\psi^\dagger$ is the conjugate transpose of $\psi$, and noticing that
$$(\gamma^\mu)^\dagger\gamma^0 = \gamma^0\gamma^\mu,$$
we obtain, by taking the Hermitian conjugate of the Dirac equation and multiplying from the right by $\gamma^0$, the
adjoint equation:
$$\bar{\psi}(i\gamma^\mu\partial_\mu + m) = 0$$
where $\partial_\mu$ is understood to act to the left. Multiplying the Dirac equation by $\bar{\psi}$ from the left, and the adjoint
equation by $\psi$ from the right, and subtracting, produces the law of conservation of the Dirac current in covariant form:
$$\partial_\mu(\bar{\psi}\gamma^\mu\psi) = 0.$$
Now we see the great advantage of the first-order equation over the one Schrodinger had tried: this is the conserved
current density required by relativistic invariance, only now its 0-component is positive definite:
$$J^0 = \bar{\psi}\gamma^0\psi = \psi^\dagger\psi.$$
The Dirac equation and its adjoint are the Euler-Lagrange equations of the 4-d invariant action integral
$$S = \int L\,d^4x$$
where $d^4x = dt\,dx\,dy\,dz$, and the scalar L is the Dirac Lagrangian
$$L = mc\,\bar{\psi}\psi - \frac{i\hbar}{2}\left(\bar{\psi}\gamma^\mu(\partial_\mu\psi) - (\partial_\mu\bar{\psi})\gamma^\mu\psi\right)$$
and for the purposes of variation, $\psi$ and $\bar{\psi}$ are regarded as independent fields. The relativistic invariance also
follows immediately from the variational principle.
Coupling to an electromagnetic field
To consider problems in which an applied electromagnetic field interacts with the particles described by the Dirac
equation, one uses the correspondence principle, and takes over into the theory the corresponding expression from
classical mechanics, whereby the total momentum of a charged particle in an external field is modified as so:
$$\mathbf{p} \rightarrow \mathbf{p} - \frac{q}{c}\mathbf{A},$$
(where q is the charge of the particle; for example, q = −e < 0 for an electron). In natural units, the Dirac
equation then takes the form
$$\left[-i\gamma^\mu(\partial_\mu + iqA_\mu) + m\right]\psi = 0.$$
The validity of this prescription has been confirmed experimentally with great precision. It is known as minimal
coupling, and is found throughout particle physics. Indeed, while the introduction of the electromagnetic field in this
way is essentially phenomenological in this context, it arises from a fundamental principle in quantum field theory.
Now as stated above, the transformation U is defined only up to a phase factor $e^{i\theta}$. Also, the fundamental
observable of the Dirac theory, the current, is unchanged if we multiply the field (which is not a wavefunction) by an
arbitrary phase. Because the field is not a wavefunction, this phase invariance has a different physical meaning from
the phase invariance of probability amplitudes. We may exploit this to get the form of the mutual interaction of a
Dirac particle and the electromagnetic field, as opposed to simply considering a Dirac particle in an applied field, by
assuming this arbitrary phase factor to depend continuously on position:
$$\psi \to e^{i\theta(x)}\psi, \qquad \bar\psi \to e^{-i\theta(x)}\bar\psi.$$
Notice now that
$$\partial_\mu\left(e^{i\theta(x)}\psi\right) = e^{i\theta(x)}\left(\partial_\mu + i\,\partial_\mu\theta(x)\right)\psi.$$
To preserve minimal coupling, we must add to the potential a term proportional to the gradient of the phase. But we
know from electrodynamics that this leaves the electromagnetic field itself invariant. The value of the phase is
arbitrary, but not how it changes from place to place. This is the starting point of gauge theory, which is the main
principle on which quantum field theory is based. The simplest such theory, and the one most thoroughly
understood, is known as quantum electrodynamics. The equations of field theory thus have invariance under both
Lorentz transformations and gauge transformations.
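The statement that a gradient term added to the potential preserves minimal coupling can be checked directly. The worked lines below are a sketch in natural units; the explicit transformation law for the potential, with the factor 1/q, is an assumption chosen to match the coupling $\partial_\mu + iqA_\mu$ used in this section.

```latex
\begin{aligned}
\psi' &= e^{i\theta(x)}\psi, \qquad A'_\mu = A_\mu - \frac{1}{q}\,\partial_\mu\theta(x), \\
(\partial_\mu + iqA'_\mu)\,\psi'
  &= e^{i\theta}\left(\partial_\mu + i\,\partial_\mu\theta + iqA_\mu - i\,\partial_\mu\theta\right)\psi
   = e^{i\theta}\,(\partial_\mu + iqA_\mu)\,\psi, \\
F'_{\mu\nu} &= \partial_\mu A'_\nu - \partial_\nu A'_\mu
   = F_{\mu\nu} - \frac{1}{q}\left(\partial_\mu\partial_\nu - \partial_\nu\partial_\mu\right)\theta
   = F_{\mu\nu} .
\end{aligned}
```

The coupled Dirac equation therefore keeps its form, up to an overall phase, and the field strength is unchanged.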
Curved spacetime Dirac equation
The Dirac equation can be written in curved spacetime using vierbein fields. Vierbeins describe a local frame that
makes it possible to define Dirac matrices at every point. Contracting these matrices with vierbeins gives the right
transformation properties. In this way Dirac's equation takes the following form in curved spacetime [1]:
$$-i\gamma^a e_a^\mu D_\mu\Psi + m\Psi = 0.$$
Here $e_a^\mu$ is the vierbein and $D_\mu$ is the covariant derivative for fermion fields, defined as follows
where $\eta_{ac}$ is the Lorentzian metric, $\sigma^{ab}$ is the commutator of Dirac matrices:
$$\sigma^{ab} = \frac{i}{2}[\gamma^a, \gamma^b]$$
and $\omega$ is the spin connection:
where $\Gamma$ is the Christoffel symbol. Note that here, Latin letters denote the "Lorentzian" indices and Greek ones
denote "Riemannian" indices.
Physical interpretation
The Dirac theory, while providing a wealth of information that is accurately confirmed by experiments, nevertheless
introduces a new physical paradigm that appears at first difficult to interpret and even paradoxical. Some of these
issues of interpretation must be regarded as open questions. Here we will see how the Dirac theory brilliantly
answered some of the outstanding issues in physics at the time it was put forward, while posing others that are still
the subject of debate.
Identification of observables
The critical physical question in a quantum theory is: what are the physically observable quantities defined by the
theory? According to general principles, such quantities are defined by Hermitian operators that act on the Hilbert
space of possible states of a system. The eigenvalues of these operators are then the possible results of measuring the
corresponding physical quantity. In the Schrödinger theory, the simplest such object is the overall Hamiltonian,
which represents the total energy of the system. If we wish to maintain this interpretation on passing to the Dirac
theory, we must take the Hamiltonian to be
$$H = \gamma^0\left[mc^2 + c\sum_{k=1}^{3}\gamma^k\left(p_k - \frac{q}{c}A_k\right)\right] + qA^0.$$
This looks promising, because we see by inspection the rest energy of the particle and, in the case $\mathbf{A} = 0$, the energy
of a charge placed in an electric potential $qA^0$. What about the term involving the vector potential? In classical
electrodynamics, the energy of a charge moving in an applied potential is
$$H = c\sqrt{\left(\mathbf{p} - \frac{q}{c}\mathbf{A}\right)^2 + m^2c^2} + qA^0.$$
Thus the Dirac Hamiltonian is fundamentally distinguished from its classical counterpart, and we must take great
care to correctly identify what is an observable in this theory. Much of the apparent paradoxical behavior implied by
the Dirac equation amounts to a misidentification of these observables. Let us now describe one such effect, (cont'd)
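One way to see how the matrix-valued Dirac Hamiltonian relates to the classical square-root expression is to square it in the free case. The following Python sketch assumes no applied field (A = 0 and A^0 = 0) and the standard Dirac representation of the gamma matrices; the function names are illustrative and do not come from the source.

```python
# Pauli matrices as nested lists of Python complex numbers
SIGMA = [
    [[0j, 1 + 0j], [1 + 0j, 0j]],
    [[0j, -1j], [1j, 0j]],
    [[1 + 0j, 0j], [0j, -1 + 0j]],
]

# gamma^0 in the Dirac representation (assumed representation)
BETA = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def alpha(k):
    """alpha^k = gamma^0 gamma^k: sigma_k placed in both off-diagonal 2x2 blocks."""
    out = [[0j] * 4 for _ in range(4)]
    for i in range(2):
        for j in range(2):
            out[i][j + 2] = SIGMA[k][i][j]
            out[i + 2][j] = SIGMA[k][i][j]
    return out


def free_dirac_hamiltonian(p, m, c=1.0):
    """H = gamma^0 (m c^2 + c sum_k gamma^k p_k), with the potentials set to zero."""
    alphas = [alpha(k) for k in range(3)]
    return [[BETA[i][j] * m * c * c
             + c * sum(alphas[k][i][j] * p[k] for k in range(3))
             for j in range(4)] for i in range(4)]


def deviation_from_energy_squared(p, m, c=1.0):
    """Largest entry of H^2 - (p^2 c^2 + m^2 c^4) * identity."""
    h = free_dirac_hamiltonian(p, m, c)
    h2 = matmul(h, h)
    e2 = c * c * sum(x * x for x in p) + (m * c * c) ** 2
    return max(abs(h2[i][j] - (e2 if i == j else 0.0))
               for i in range(4) for j in range(4))


# Should be at the level of floating-point rounding
print(deviation_from_energy_squared([0.3, -1.2, 0.5], 0.9))
```

The square of the free Dirac Hamiltonian is proportional to the identity and equals the square of the classical energy, so its eigenvalues come in pairs of opposite sign. This is consistent with the negative-energy solutions discussed later in the article.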
History
Since the Dirac equation was originally invented to describe the electron, we will generally speak of "electrons" in
this article. The equation also applies to quarks, which are also elementary spin-1/2 particles. A modified Dirac
equation can be used to approximately describe protons and neutrons, which are not elementary particles (they are
made up of quarks), but have a net spin of 1/2. Another modification of the Dirac equation, called the Majorana
equation, is thought to describe neutrinos, which are also spin-1/2 particles.
The Dirac equation describes the probability amplitudes for a single electron. This is a single-particle theory; in other
words, it does not account for the creation and destruction of the particles, and for the ultimate need to switch from
the Dirac equation for wavefunctions to the physically distinct Dirac equation for fields. It gives a good prediction of
the magnetic moment of the electron and explains much of the fine structure observed in atomic spectral lines. It also
explains the spin of the electron. Two of the four solutions of the equation correspond to the two spin states of the
electron. The other two solutions make the peculiar prediction that there exists an infinite set of quantum states in
which the electron possesses negative energy. This strange result led Dirac to predict, via a remarkable hypothesis
known as "hole theory," the existence of particles behaving like positively-charged electrons. Dirac thought at first
these particles might be protons. He was chagrined when the strict prediction of his equation (which actually
specifies particles of the same mass as the electron) was verified by the discovery of the positron in 1932. When
asked later why he hadn't actually boldly predicted the yet unfound positron with its correct mass, Dirac answered
"Pure cowardice!" He shared the Nobel Prize anyway, in 1933.
A similar equation for spin 3/2 particles is called the Rarita-Schwinger equation.
Hole theory
The negative E solutions found in the preceding section are problematic, for it was assumed that the particle has a
positive energy. Mathematically speaking, however, there seems to be no reason for us to reject the negative-energy
solutions. Since they exist, we cannot simply ignore them, for once we include the interaction between the electron
and the electromagnetic field, any electron placed in a positive-energy eigenstate would decay into negative-energy
eigenstates of successively lower energy by emitting excess energy in the form of photons. Real electrons obviously
do not behave in this way.
To cope with this problem, Dirac introduced the hypothesis, known as hole theory, that the vacuum is the
many-body quantum state in which all the negative-energy electron eigenstates are occupied. This description of the
vacuum as a "sea" of electrons is called the Dirac sea. Since the Pauli exclusion principle forbids electrons from
occupying the same state, any additional electron would be forced to occupy a positive-energy eigenstate, and
positive-energy electrons would be forbidden from decaying into negative-energy eigenstates.
Dirac further reasoned that if the negative-energy eigenstates are incompletely filled, each unoccupied eigenstate -
called a hole - would behave like a positively charged particle. The hole possesses a positive energy, since energy is
required to create a particle-hole pair from the vacuum. As noted above, Dirac initially thought that the hole might
be the proton, but Hermann Weyl pointed out that the hole should behave as if it had the same mass as an electron,
whereas the proton is over 1800 times heavier. The hole was eventually identified as the positron, experimentally
discovered by Carl Anderson in 1932.
It is not entirely satisfactory to describe the "vacuum" using an infinite sea of negative-energy electrons. The
infinitely negative contributions from the sea of negative-energy electrons have to be canceled by an infinite positive
"bare" energy and the contribution to the charge density and current coming from the sea of negative-energy
electrons is exactly canceled by an infinite positive "jellium" background so that the net electric charge density of the
vacuum is zero. In quantum field theory, a Bogoliubov transformation on the creation and annihilation operators
(turning an occupied negative-energy electron state into an unoccupied positive energy positron state and an
unoccupied negative-energy electron state into an occupied positive energy positron state) allows us to bypass the
Dirac sea formalism even though, formally, it is equivalent to it.
In certain applications of condensed matter physics, however, the underlying concepts of "hole theory" are valid. The
sea of conduction electrons in an electrical conductor, called a Fermi sea, contains electrons with energies up to the
chemical potential of the system. An unfilled state in the Fermi sea behaves like a positively-charged electron,
though it is referred to as a "hole" rather than a "positron". The negative charge of the Fermi sea is balanced by the
positively-charged ionic lattice of the material.
Dirac bilinears
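There are five different (neutral) Dirac bilinear terms not involving any derivatives:
• (S)calar: $\bar\psi\psi$ (scalar, P-even)
• (P)seudoscalar: $\bar\psi\gamma^5\psi$ (scalar, P-odd)
• (V)ector: $\bar\psi\gamma^\mu\psi$ (vector, P-odd)
• (A)xial: $\bar\psi\gamma^\mu\gamma^5\psi$ (vector, P-even)
• (T)ensor: $\bar\psi\sigma^{\mu\nu}\psi$ (antisymmetric tensor, P-even),
where $\sigma^{\mu\nu} = \frac{i}{2}[\gamma^\mu, \gamma^\nu]$ and $\gamma^5 = \gamma_5 = \frac{i}{4!}\epsilon_{\mu\nu\rho\lambda}\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\lambda = i\gamma^0\gamma^1\gamma^2\gamma^3$.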
A Dirac mass term is an S coupling. A Yukawa coupling may be S or P. The electromagnetic coupling is V. The
weak interactions are V-A.
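The algebraic properties behind the bilinear classification can be checked numerically. The Python sketch below builds the gamma matrices in the Dirac representation (an assumed choice; the identities themselves do not depend on the representation) and tests the Clifford relation, the adjoint identity used for the Dirac current, and the basic properties of $\gamma^5$. All function names are illustrative.

```python
SIGMA = [
    [[0j, 1 + 0j], [1 + 0j, 0j]],
    [[0j, -1j], [1j, 0j]],
    [[1 + 0j, 0j], [0j, -1 + 0j]],
]
ETA = [1, -1, -1, -1]  # metric signature (+, -, -, -), assumed
IDENTITY = [[(1 + 0j) if i == j else 0j for j in range(4)] for i in range(4)]
ZERO = [[0j] * 4 for _ in range(4)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(4)] for i in range(4)]


def scale(s, a):
    return [[s * a[i][j] for j in range(4)] for i in range(4)]


def dagger(a):
    """Conjugate transpose."""
    return [[a[j][i].conjugate() for j in range(4)] for i in range(4)]


def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))


def gamma(mu):
    """gamma^mu in the Dirac representation."""
    if mu == 0:
        return [[((1 if i < 2 else -1) if i == j else 0) + 0j for j in range(4)]
                for i in range(4)]
    s = SIGMA[mu - 1]
    out = [[0j] * 4 for _ in range(4)]
    for i in range(2):
        for j in range(2):
            out[i][j + 2] = s[i][j]
            out[i + 2][j] = -s[i][j]
    return out


def gamma5():
    """gamma^5 = i gamma^0 gamma^1 gamma^2 gamma^3."""
    product = matmul(matmul(gamma(0), gamma(1)), matmul(gamma(2), gamma(3)))
    return scale(1j, product)


def check_identities():
    g5 = gamma5()
    return {
        # {gamma^mu, gamma^nu} = 2 eta^{mu nu}
        "clifford": all(
            close(add(matmul(gamma(m), gamma(n)), matmul(gamma(n), gamma(m))),
                  scale(2 * ETA[m] if m == n else 0, IDENTITY))
            for m in range(4) for n in range(4)),
        # (gamma^mu)^dagger gamma^0 = gamma^0 gamma^mu
        "adjoint": all(
            close(matmul(dagger(gamma(m)), gamma(0)), matmul(gamma(0), gamma(m)))
            for m in range(4)),
        "gamma5_squares_to_one": close(matmul(g5, g5), IDENTITY),
        # gamma^5 anticommutes with every gamma^mu
        "gamma5_anticommutes": all(
            close(add(matmul(g5, gamma(m)), matmul(gamma(m), g5)), ZERO)
            for m in range(4)),
    }


print(check_identities())
```

Because $\gamma^5$ anticommutes with each $\gamma^\mu$, inserting it into a bilinear flips the behaviour under parity, which is the origin of the P-even and P-odd labels in the list.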
See also
• Bohr-Sommerfeld theory
• Breit equation
• Dirac field
• Einstein-Maxwell-Dirac equations
• Feynman checkerboard
• Foldy-Wouthuysen transformation
• Klein-Gordon equation
• Quantum electrodynamics
• Rarita-Schwinger equation
• Theoretical and experimental justification for the Schrödinger equation
• The Dirac equation appears on the floor of Westminster Abbey, on the plaque commemorating Paul
Dirac's life, which was inaugurated on November 13, 1995.
References
[1] Lawrie, Ian D.. A Unified Grand Tour of Theoretical Physics.
[2] http://www.dirac.ch/PaulDirac.html
Selected papers
• P.A.M. Dirac "The Quantum Theory of the Electron", Proc. R. Soc. A (1928) vol. 117, no 778, 610-624
(http://dx.doi.org/10.1098/rspa.1928.0023)
• P.A.M. Dirac "The Quantum Theory of the Electron", Proc. R. Soc. A117) (http://gallica.bnf.fr/ark:/12148/
bpt6k562109) link to the volume of the Proceedings of the Royal Society of London containing the article at page
610
• P.A.M. Dirac "A Theory of Electrons and Protons", Proc. R. Soc. A126) (http://gallica.bnf.fr/ark:/12148/
bpt6k56219d/f388.table) link to the volume of the Proceedings of the Royal Society of London containing the
article at page 360
• C.D. Anderson, Phys. Rev. 43, 491 (1933)
• R. Frisch and O. Stern, Z. Phys. 85, 4 (1933)
• R. Chen, New Exact Solution of Dirac-Coulomb Equation with Exact Boundary Condition. International Journal
of Theoretical Physics 47, 881 (2008).
Textbooks
• Halzen, Francis; Martin, Alan (1984). Quarks & Leptons: An Introductory Course in Modern Particle Physics.
John Wiley & Sons. ISBN.
• Dirac, P.A.M., Principles of Quantum Mechanics, 4th edition (Clarendon, 1982)
• Shankar, R., Principles of Quantum Mechanics, 2nd edition (Plenum, 1994)
• Bjorken, J D & Drell, S, Relativistic Quantum mechanics
• Thaller, B., The Dirac Equation, Texts and Monographs in Physics (Springer, 1992)
• Schiff, L.I., Quantum Mechanics, 3rd edition (McGraw-Hill, 1968)
• Griffiths, D.J., Introduction to Elementary Particles, 2nd edition (Wiley-VCH, 2008) ISBN 978-3-527-40601-2.
External links
• The Dirac Equation (http://www.mathpages.com/home/kmath654/kmath654.htm) at MathPages
• The Nature of the Dirac Equation, its solutions and Spin (http://www.mc.maricopa.edu/~kevinlg/i256/Nature_Dirac.pdf)
• Dirac equation for a spin 1/2 particle (http://electron6.phys.utk.edu/qm2/modules/m9/dirac.htm)
• Unitary Principle and Mechanical Non-solution Proving of Neomorphic Dirac Equation
(http://commons.wikimedia.org/wiki/File:Unitary_Principle_and_Mechanical_Non-solution_Proving_of_Neomorphic_Dirac_Equation.pdf)
• Pedagogic Aids to Quantum Field Theory (http://www.quantumfieldtheory.info) click on Chap. 4 for a
step-by-small-step introduction to the Dirac equation, spinors, and relativistic spin/helicity operators.
Klein-Gordon equation
The Klein-Gordon equation (Klein-Fock-Gordon equation or sometimes Klein-Gordon-Fock equation) is a
relativistic version of the Schrödinger equation.
It is the equation of motion of a quantum scalar or pseudoscalar field, a field whose quanta are spinless particles. It
cannot be straightforwardly interpreted as a Schrödinger equation for a quantum state, because it is second order in
time and because it does not admit a positive definite conserved probability density. Still, with the appropriate
interpretation, it does describe the quantum amplitude for finding a point particle in various places, the relativistic
wavefunction, but the particle propagates both forwards and backwards in time. Any solution to the Dirac equation is
automatically a solution to the Klein-Gordon equation, but the converse is not true.
Statement
The Klein-Gordon equation is
$$\frac{1}{c^2}\frac{\partial^2}{\partial t^2}\psi - \nabla^2\psi + \frac{m^2c^2}{\hbar^2}\psi = 0.$$
It is most often written in natural units:
$$-\partial_t^2\psi + \nabla^2\psi = m^2\psi.$$
The form is determined by requiring that plane wave solutions of the equation:
$$\psi = e^{-i\omega t + i\mathbf{k}\cdot\mathbf{x}} = e^{ik_\mu x^\mu}$$
obey the energy momentum relation of special relativity:
$$-p_\mu p^\mu = E^2 - P^2 = \omega^2 - k^2 = -k_\mu k^\mu = m^2.$$
Unlike the Schrödinger equation, there are two values of $\omega$ for each k, one positive and one negative. Only by
separating out the positive and negative frequency parts does the equation describe a relativistic wavefunction. For
the time-independent case, the Klein-Gordon equation becomes
$$\left[\nabla^2 - \frac{m^2c^2}{\hbar^2}\right]\psi(\mathbf{r}) = 0,$$
which is the homogeneous screened Poisson equation.
History
The equation was named after the physicists Oskar Klein and Walter Gordon, who in 1927 proposed that it describes
relativistic electrons. Although it turned out that the Dirac equation describes the spinning electron, the
Klein-Gordon equation correctly describes the spinless pion. The pion is a composite particle; no spinless elementary
particles have yet been found, although the Higgs boson is theorized to exist as a spin-zero boson, according to the
Standard Model.
The Klein-Gordon equation was first considered as a quantum wave equation by Schrodinger in his search for an
equation describing de Broglie waves. The equation is found in his notebooks from late 1925, and he appears to have
prepared a manuscript applying it to the hydrogen atom. Yet, without taking into account the electron's spin, the
Klein-Gordon equation predicts the hydrogen atom's fine structure incorrectly, including overestimating the overall
magnitude of the splitting pattern by a factor of 4n/(2n - 1) for the n-th energy level. In January 1926,
Schrödinger submitted for publication instead his equation, a non-relativistic approximation that predicts the Bohr
energy levels of hydrogen without fine structure.
In 1926, soon after the Schrodinger equation was introduced, Vladimir Fock wrote an article about its generalization
for the case of magnetic fields, where forces were dependent on velocity, and independently derived this equation.
Both Klein and Fock used Kaluza and Klein's method. Fock also determined the gauge theory for the wave equation.
The Klein-Gordon equation for a free particle has a simple plane wave solution.
Derivation
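The non-relativistic equation for the energy of a free particle is
$$\frac{\mathbf{p}^2}{2m} = E.$$
By quantizing this, we get the non-relativistic Schrödinger equation for a free particle,
$$\frac{\hat{\mathbf{p}}^2}{2m}\psi = i\hbar\frac{\partial}{\partial t}\psi,$$
where
$$\hat{\mathbf{p}} = -i\hbar\nabla$$
is the momentum operator ($\nabla$ being the del operator).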
The Schrodinger equation suffers from not being relativistically covariant, meaning it does not take into account
Einstein's special relativity.
It is natural to try to use the identity from special relativity
$$\sqrt{\mathbf{p}^2c^2 + m^2c^4} = E$$
for the energy; then, just inserting the quantum mechanical momentum operator yields the equation
$$\sqrt{(-i\hbar\nabla)^2c^2 + m^2c^4}\,\psi = i\hbar\frac{\partial}{\partial t}\psi.$$
This, however, is a cumbersome expression to work with because the differential operator cannot be evaluated while
under the square root sign. In addition, this equation, as it stands, is nonlocal.
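Klein and Gordon instead began with the square of the above identity, i.e.
$$\mathbf{p}^2c^2 + m^2c^4 = E^2,$$
which, when quantized, gives
$$\left((-i\hbar\nabla)^2c^2 + m^2c^4\right)\psi = \left(i\hbar\frac{\partial}{\partial t}\right)^2\psi,$$
which simplifies to
$$-\hbar^2c^2\nabla^2\psi + m^2c^4\psi = -\hbar^2\frac{\partial^2}{(\partial t)^2}\psi.$$
Rearranging terms yields
$$\frac{1}{c^2}\frac{\partial^2}{(\partial t)^2}\psi - \nabla^2\psi + \frac{m^2c^2}{\hbar^2}\psi = 0.$$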
Since all reference to imaginary numbers has been eliminated from this equation, it can be applied to fields that are
real valued as well as those that have complex values.
Using the reciprocal of the Minkowski metric $\mathrm{diag}(-c^2, 1, 1, 1)$, we get
$$-\eta^{\mu\nu}\partial_\mu\partial_\nu\psi + \frac{m^2c^2}{\hbar^2}\psi = 0$$
in covariant notation. This is often abbreviated as
$$(\Box + \mu^2)\psi = 0,$$
where
$$\mu = \frac{mc}{\hbar}$$
and
$$\Box = \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \nabla^2.$$
This operator is called the d'Alembert operator. Today this form is interpreted as the relativistic field equation for a
scalar (i.e. spin-0) particle. Furthermore, any solution to the Dirac equation (for a spin-one-half particle) is
automatically a solution to the Klein-Gordon equation, though not all solutions of the Klein-Gordon equation are
solutions of the Dirac equation. It is noteworthy that the Klein-Gordon equation is very similar to the Proca
equation.
Relativistic free particle solution
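The Klein-Gordon equation for a free particle can be written as
$$\nabla^2\psi - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\psi = \frac{m^2c^2}{\hbar^2}\psi$$
with the same solution as in the non-relativistic case:
$$\psi(\mathbf{r}, t) = e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}$$
except with the constraint
$$-k^2 + \frac{\omega^2}{c^2} = \frac{m^2c^2}{\hbar^2}.$$
Just as with the non-relativistic particle, we have for energy and momentum:
$$\langle\mathbf{p}\rangle = \langle\psi|{-i\hbar\nabla}|\psi\rangle = \hbar\mathbf{k},$$
$$\langle E\rangle = \langle\psi|i\hbar\frac{\partial}{\partial t}|\psi\rangle = \hbar\omega.$$
Except that now when we solve for k and $\omega$ and substitute into the constraint equation, we recover the relationship
between energy and momentum for relativistic massive particles:
$$\langle E\rangle^2 = m^2c^4 + \langle\mathbf{p}\rangle^2c^2.$$
For massless particles, we may set m = 0 in the above equations. We then recover the relationship between energy
and momentum for massless particles:
$$\langle E\rangle = \langle|\mathbf{p}|\rangle c.$$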
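The plane-wave solution and its constraint can be checked numerically with finite differences. The Python sketch below works in one space dimension and in natural units (both simplifying assumptions), and the function names are illustrative. It evaluates the residual of the Klein-Gordon equation for a plane wave whose frequency satisfies the constraint, and for one whose frequency does not.

```python
import cmath
import math


def plane_wave(k, omega, x, t):
    """psi(x, t) = exp(i (k x - omega t)) in one space dimension."""
    return cmath.exp(1j * (k * x - omega * t))


def relativistic_omega(k, m):
    """Positive root of the constraint omega^2 = k^2 + m^2 (natural units)."""
    return math.sqrt(k * k + m * m)


def kg_residual(k, m, omega, x, t, h=1e-3):
    """Central-difference estimate of (d^2/dt^2 - d^2/dx^2 + m^2) psi at (x, t)."""
    def psi(xx, tt):
        return plane_wave(k, omega, xx, tt)

    d2t = (psi(x, t + h) - 2 * psi(x, t) + psi(x, t - h)) / h ** 2
    d2x = (psi(x + h, t) - 2 * psi(x, t) + psi(x - h, t)) / h ** 2
    return d2t - d2x + m * m * psi(x, t)


k, m = 1.3, 0.7
on_shell = kg_residual(k, m, relativistic_omega(k, m), x=0.4, t=0.2)
off_shell = kg_residual(k, m, omega=k, x=0.4, t=0.2)  # massless dispersion used on purpose
print(abs(on_shell), abs(off_shell))
```

The first residual should be small, limited only by the finite-difference step, while the second is of order m squared because the frequency violates the constraint. The negative root of the constraint gives an equally valid solution.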
Action
The Klein-Gordon equation can also be derived from the following action
where $\psi$ is the Klein-Gordon field and m is its mass. The complex conjugate of $\psi$ is written $\bar\psi$. If the scalar
field is taken to be real-valued, then $\bar\psi = \psi$.
From this we can derive the stress-energy tensor of the scalar field. It is
i-2
Electromagnetic interaction
There is a simple way to make any field interact with electromagnetism in a gauge invariant way: replace the
derivative operators with the gauge covariant derivative operators. The Klein-Gordon equation becomes:
$$D_\mu D^\mu\phi = -(\partial_t - ieA_0)^2\phi + (\partial_i - ieA_i)^2\phi = m^2\phi$$
in natural units, where A is the vector potential. While it is possible to add many higher order terms, for example,
$$D_\mu D^\mu\phi + AF^{\mu\nu}D_\mu D_\nu(D_\alpha D^\alpha\phi) = 0,$$
these terms are not renormalizable in 3+1 dimensions.
The field equation for a charged scalar field multiplies by i, which means the field must be complex. In order for a
field to be charged, it must have two components that can rotate into each other, the real and imaginary parts.
The action for a charged scalar is the covariant version of the uncharged action:
$$S = \int_x (\partial_\mu\phi^* + ieA_\mu\phi^*)(\partial_\nu\phi - ieA_\nu\phi)\,\eta^{\mu\nu} = \int_x |D\phi|^2.$$
Gravitational interaction
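In general relativity, we include the effect of gravity and the Klein-Gordon equation becomes
$$\frac{-1}{\sqrt{-g}}\,\partial_\alpha\left(g^{\alpha\beta}\sqrt{-g}\,\partial_\beta\psi\right) + \frac{m^2c^2}{\hbar^2}\psi = 0$$
or equivalently
$$0 = -g^{\alpha\beta}\nabla_\alpha\nabla_\beta\psi + \frac{m^2c^2}{\hbar^2}\psi = -g^{\alpha\beta}\partial_\alpha\partial_\beta\psi + g^{\alpha\beta}\Gamma^\sigma_{\alpha\beta}\partial_\sigma\psi + \frac{m^2c^2}{\hbar^2}\psi,$$
where $g^{\alpha\beta}$ is the reciprocal of the metric tensor that is the gravitational potential field, $g$ is the determinant of the
metric tensor, $\nabla_\alpha$ is the covariant derivative and $\Gamma^\sigma_{\alpha\beta}$ is the Christoffel symbol that is the gravitational force field.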
References
• Sakurai, J. J. (1967). Advanced Quantum Mechanics. Addison Wesley. ISBN 0-201-06710-2.
• Davydov, A.S. (1976). Quantum Mechanics, 2nd Edition. Pergamon. ISBN 0-08-020437-6.
External links
• Weisstein, Eric W., "Klein-Gordon equation" from MathWorld.
• Linear Klein-Gordon Equation at EqWorld: The World of Mathematical Equations.
• Nonlinear Klein-Gordon Equation at EqWorld: The World of Mathematical Equations.
References
[1] http://mathworld.wolfram.com/Klein-GordonEquation.html
[2] http://eqworld.ipmnet.ru/en/solutions/lpde/lpde203.pdf
[3] http://eqworld.ipmnet.ru/en/solutions/npde/npde2107.pdf
Einstein-Maxwell-Dirac equations
Einstein-Maxwell-Dirac equations (EMD) are related to quantum field theory. The current Big Bang Model is a
quantum field theory in a curved spacetime. Unfortunately, no such theory is mathematically well-defined; in spite
of this, theoreticians claim to extract information from this hypothetical theory. On the other hand, the
super-classical limit of the not mathematically well-defined QED in a curved spacetime is the mathematically
well-defined Einstein-Maxwell-Dirac system. (One could get a similar system for the standard model.) As a super
theory, EMD violates the positivity condition in the Penrose-Hawking Singularity Theorem. Thus, it is possible that
there would be complete solutions without any singularities; Yau has in fact constructed some. Furthermore, it is
known that the Einstein-Maxwell-Dirac system admits of solitonic solutions, i.e., classical electrons and photons.
This is the kind of theory Einstein was hoping for. EMD is also a totally geometricized theory as a non-commutative
geometry; here, the charge e and the mass m of the electron are geometric invariants of the non-commutative
geometry analogous to pi.
One way of trying to construct a rigorous QED and beyond is to attempt to apply the deformation quantization
program to MD, and more generally, EMD. This would involve the following.
Program for SCESM
The Super-Classical Einstein-Standard Model:
• 1. Extend Asymptotic Completeness, Global Existence and the Infrared Problem for the Maxwell-Dirac Equations
to SCESM (Memoirs of the American Mathematical Society), by M. Flato, Jacques C. H. Simon, Erik Taflin
(http://www.amazon.com/Asymptotic-Completeness-Existence-Maxwell-Dirac-Mathematical/dp/
0821806831/ref=sr_l_l?ie=UTF8&s=books&qid=1240926988&sr=l-l).
• 2. Show that the positivity condition in the Penrose-Hawking singularity theorem is violated for the SCESM.
Construct smooth solutions to SCESM having Dark Stars. See here: The Large Scale Structure of Space-Time by
Stephen W. Hawking, G. F. R. Ellis (http://www.amazon.com/
Structure-Space-Time-Cambridge-Monographs-Mathematical/dp/0521099064/ref=pd_bbs_sr_lie=UTF8&
s=books&qid=1240927769&sr=8-l)
• 3. Follow three substeps
• i. Derive approximate history of the universe from SCESM - both analytically and via computer simulation.
• ii. Compare with ESM (the QSM in a curved space-time).
• iii. Compare with observation. See: Cosmology by Steven Weinberg (http://www.amazon.com/
Cosmology-StevenWeinberg/dp/O^
• 4. Show that the solution space to SCESM, F, is a reasonable infinite dimensional super-symplectic manifold. See:
Supersymmetry for Mathematicians: An Introduction (http://www.amazon.com/
Supersymmetry-Mathematicians-Introduction-Courant-Lecture/dp/0821835742/ref=sr_l_5?ie=UTF8&
s=books&qid=1240894011&sr=1-5)
• 5. Apply deformation quantization to F to obtain mathematically rigorous definition of SQESM (quantum version
of SCESM). See:
Deformation Theory and Symplectic Geometry by Daniel Sternheimer, John Rawnsley (http://www.amazon.com/
Deformation-Symplectic-Geometry-Mathematical-Physics/dp/0792345258/ref=sr_l_l?ie=UTF8&s=books&
qid=1240930131&sr=l-l)
• 6. Derive history of the universe from SQESM and compare with observation.
References
• http://arxiv.org/PS_cache/gr-qc/pdf/9801/9801079v3.pdf
• http://arxiv.org/PS_cache/gr-qc/pdf/9810/9810048v4.pdf
• http://arxiv.org/PS_cache/gr-qc/pdf/0005/0005028v3.pdf
• http://deepblue.lib.umich.edu/bitstream/2027.42/49217/2/cqg6_13_009.pdf
• http://arxiv.org/PS_cache/gr-qc/pdf/9910/9910047v2.pdf
• http://arxiv.org/PS_cache/gr-qc/pdf/9810/9810048v4.pdf
• http://arxiv.org/PS_cache/hep-th/pdf/0608/0608221v2.pdf
• http://arxiv.org/PS_cache/hep-th/pdf/0608/0608226v2.pdf
• Varadarajan, V. S. (2004). Supersymmetry for Mathematicians: An Introduction. Courant Lecture Notes in
Mathematics 11. American Mathematical Society. ISBN 0-8218-3574-2.
• Deligne, Pierre (1999). Quantum Fields and Strings: A Course for Mathematicians. 1. American Mathematical
Society. ISBN 0-8218-2012-5.
• Deligne, Pierre (1999). Quantum Fields and Strings: A Course for Mathematicians. 2. American Mathematical
Society. ISBN 0-8218-2012-5.
Rigged Hilbert space
In mathematics, a rigged Hilbert space (Gelfand triple, nested Hilbert space, equipped Hilbert space) is a
construction designed to link the distribution and square-integrable aspects of functional analysis. Such spaces were
introduced to study spectral theory in the broad sense. They can bring together the 'bound state' (eigenvector) and
'continuous spectrum', in one place.
Motivation
Since a function such as
$$x \mapsto e^{ix},$$
which is in an obvious sense an eigenvector of the differential operator
$$i\frac{d}{dx}$$
on the real line R, is not square-integrable for the usual Borel measure on R, this requires some way of stepping
outside the strict confines of the Hilbert space theory. This was supplied by the apparatus of Schwartz distributions,
and a generalized eigenfunction theory was developed in the years after 1950.
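The two properties invoked in this motivation can be illustrated numerically. The Python sketch below estimates the squared norm of the exponential on a growing interval, and checks that its derivative is a constant multiple of the function itself. The function names and the choice of intervals are illustrative assumptions.

```python
import cmath


def f(x):
    """The function x -> exp(i x)."""
    return cmath.exp(1j * x)


def norm_squared_on_interval(half_width, steps=10000):
    """Trapezoid estimate of the integral of |f|^2 over [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0.5 * (abs(f(-half_width)) ** 2 + abs(f(half_width)) ** 2)
    for j in range(1, steps):
        total += abs(f(-half_width + j * h)) ** 2
    return total * h


def log_derivative(x, h=1e-5):
    """Central-difference estimate of f'(x) / f(x); constant for an eigenfunction of d/dx."""
    return (f(x + h) - f(x - h)) / (2.0 * h) / f(x)


for half_width in (1.0, 10.0, 100.0):
    # The estimate grows like 2 * half_width, so no finite limit exists
    print(half_width, norm_squared_on_interval(half_width))

print(log_derivative(0.7), log_derivative(-3.1))
```

Since the modulus of the function is 1 everywhere, the integral over the whole line diverges, so the function lies outside the Hilbert space of square-integrable functions even though it behaves like an eigenvector of the derivative operator.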
Functional analysis approach
The concept of rigged Hilbert space places this idea in an abstract functional-analytic framework. Formally, a rigged
Hilbert space consists of a Hilbert space H, together with a subspace $\Phi$ which carries a finer topology, that is one for
which the natural inclusion
$$\Phi \subset H$$
is continuous. It is no loss to assume that $\Phi$ is dense in H for the Hilbert norm. We consider the inclusion of dual
spaces $H^*$ in $\Phi^*$. The latter, dual to $\Phi$ in its 'test function' topology, is realised as a space of distributions or
generalised functions of some sort, and the linear functionals on the subspace $\Phi$ of type
$$\phi \mapsto \langle v, \phi\rangle$$
for v in H are faithfully represented as distributions (because we assume $\Phi$ dense).
Now by applying the Riesz representation theorem we can identify $H^*$ with H. Therefore the definition of rigged
Hilbert space is in terms of a sandwich:
$$\Phi \subset H \subset \Phi^*.$$
The most significant examples are those for which $\Phi$ is a nuclear space; this comment is an abstract expression of the idea
that $\Phi$ consists of test functions and $\Phi^*$ of the corresponding distributions.
Formal definition (Gelfand triple)
A rigged Hilbert space is a pair $(H, \Phi)$ with H a Hilbert space, $\Phi$ a dense subspace, such that $\Phi$ is given a
topological vector space structure for which the inclusion map i is continuous. Identifying H with its dual space $H^*$,
the adjoint to i is the map
$$i^*: H = H^* \to \Phi^*.$$
The duality pairing between $\Phi$ and $\Phi^*$ has to be compatible with the inner product on H:
$$\langle u, v\rangle_{\Phi\times\Phi^*} = (u, v)_H$$
whenever $u \in \Phi \subset H$ and $v \in H = H^* \subset \Phi^*$.
The specific triple $(\Phi, H, \Phi^*)$ is often named the "Gelfand triple" (after the mathematician Israel Gelfand).
Note that even though $\Phi$ is isomorphic to $\Phi^*$ if $\Phi$ is a Hilbert space in its own right, this isomorphism is not the same
as the composition of the inclusion i with its adjoint $i^*$:
$$i^* i: \Phi \subset H = H^* \to \Phi^*.$$
References
• J.-P. Antoine, Quantum Mechanics Beyond Hilbert Space (1996), appearing in Irreversibility and Causality,
Semigroups and Rigged Hilbert Spaces, Arno Bohm, Heinz-Dietrich Doebner, Piotr Kielanowski, eds.,
Springer- Verlag, ISBN 3-540-64305-2. (Provides a survey overview.)
• Jean Dieudonné, Éléments d'analyse VII (1978). (See paragraphs 23.8 and 23.32)
• I. M. Gelfand and N. J. Vilenkin. Generalized Functions, vol. 4: Some Applications of Harmonic Analysis.
Rigged Hilbert Spaces. Academic Press, New York, 1964.
• R. de la Madrid, "The role of the rigged Hilbert space in Quantum Mechanics," Eur. J. Phys. 26, 287 (2005);
quant-ph/0502053
• K. Maurin, Generalized Eigenfunction Expansions and Unitary Representations of Topological Groups, Polish
Scientific Publishers, Warsaw, 1968.
• Minlos, R.A. (2001), "Rigged Hilbert space" , in Hazewinkel, Michiel, Encyclopaedia of Mathematics,
Springer, ISBN 978-1556080104
References
[1] http://arxiv.org/abs/quant-ph/0502053
[2] http://eom.springer.de/r/r082340.htm
Quantum inverse scattering method
The quantum inverse scattering method relates two different approaches: 1) The inverse scattering transform is a method of
solving classical integrable differential equations of evolutionary type; an important concept here is the Lax representation. 2)
The Bethe ansatz is a method of solving quantum models in one space and one time dimension. The quantum inverse
scattering method starts by quantizing the Lax representation and reproduces the results of the Bethe ansatz. In fact, it
allows the Bethe ansatz to be rewritten in a new form, the algebraic Bethe ansatz. This led to further progress in the understanding
of the quantum Heisenberg model, the quantum nonlinear Schrödinger equation and the Hubbard model.
In mathematics, the quantum inverse scattering method is a method for solving integrable models in 1+1
dimensions introduced by L. D. Faddeev in about 1979.
References
• Faddeev, L. (1995), "Instructive history of the quantum inverse scattering method", Acta Applicandae
Mathematicae 39 (1): 69-84, MR1329554, ISSN 0167-8019
• Korepin, V. E.; Bogoliubov, N. M.; Izergin, A. G. (1993), Quantum inverse scattering method and correlation
functions [2], Cambridge Monographs on Mathematical Physics, Cambridge University Press, MR1245942,
ISBN 978-0-521-37320-3
References
[1] http://dx.doi.org/10.1007/BF00994626
[2] http://www.cambridge.org/catalogue/catalogue.asp?isbn=9780521586467
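Quasi-Hopf algebra
A quasi-Hopf algebra is a generalization of a Hopf algebra, which was defined by the Russian mathematician
Vladimir Drinfeld in 1989.
A quasi-Hopf algebra is a quasi-bialgebra $\mathcal{B_A} = (\mathcal{A}, \Delta, \varepsilon, \Phi)$ for which there exist $\alpha, \beta \in \mathcal{A}$ and a bijective
antihomomorphism S (antipode) of $\mathcal{A}$ such that
$$\sum_i S(b_i)\,\alpha\, c_i = \varepsilon(a)\,\alpha,$$
$$\sum_i b_i\,\beta\, S(c_i) = \varepsilon(a)\,\beta$$
for all $a \in \mathcal{A}$ and where
$$\Delta(a) = \sum_i b_i \otimes c_i$$
and
$$\sum_i X_i\,\beta\, S(Y_i)\,\alpha\, Z_i = I,$$
$$\sum_j S(P_j)\,\alpha\, Q_j\,\beta\, S(R_j) = I,$$
where the expansions for the quantities $\Phi$ and $\Phi^{-1}$ are given by
$$\Phi = \sum_i X_i \otimes Y_i \otimes Z_i$$
and
$$\Phi^{-1} = \sum_j P_j \otimes Q_j \otimes R_j.$$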
As for a quasi-bialgebra, the property of being quasi-Hopf is preserved under twisting.
Usage
Quasi-Hopf algebras form the basis of the study of Drinfeld twists and the representations in terms of F-matrices
associated with finite-dimensional irreducible representations of quantum affine algebras. F-matrices can be used to
factorize the corresponding R-matrix. This leads to applications in statistical mechanics, as quantum affine algebras
and their representations give rise to solutions of the Yang-Baxter equation, a solvability condition for various
statistical models, allowing characteristics of the model to be deduced from its corresponding quantum affine
algebra. The study of F-matrices has been applied to models such as the Heisenberg XXZ model in the framework of
the algebraic Bethe ansatz. It provides a framework for solving two-dimensional integrable models by using the
Quantum inverse scattering method.
References
• Vladimir Drinfeld, Quasi-Hopf algebras, Leningrad Math J. 1 (1989), 1419-1457
• J.M. Maillet and J. Sanchez de Santos, Drinfeld Twists and Algebraic Bethe Ansatz, Amer. Math. Soc. Transl. (2)
Vol. 201, 2000
Quasitriangular Hopf algebra
In mathematics, a Hopf algebra, H, is quasitriangular [1] if there exists an invertible element, R, of $H \otimes H$ such
that
• $R\,\Delta(x) = (T \circ \Delta)(x)\,R$ for all $x \in H$, where $\Delta$ is the coproduct on H, and the linear map
$T : H \otimes H \to H \otimes H$ is given by $T(x \otimes y) = y \otimes x$,
• $(\Delta \otimes 1)(R) = R_{13} R_{23}$,
• $(1 \otimes \Delta)(R) = R_{13} R_{12}$,
where $R_{12} = \phi_{12}(R)$, $R_{13} = \phi_{13}(R)$, and $R_{23} = \phi_{23}(R)$, where $\phi_{12} : H \otimes H \to H \otimes H \otimes H$,
$\phi_{13} : H \otimes H \to H \otimes H \otimes H$, and $\phi_{23} : H \otimes H \to H \otimes H \otimes H$, are algebra morphisms determined
by
$\phi_{12}(a \otimes b) = a \otimes b \otimes 1,$
$\phi_{13}(a \otimes b) = a \otimes 1 \otimes b,$
$\phi_{23}(a \otimes b) = 1 \otimes a \otimes b.$
R is called the R-matrix.
As a consequence of the properties of quasitriangularity, the R-matrix, R, is a solution of the Yang-Baxter equation
(and so a module V of H can be used to determine quasi-invariants of braids, knots and links). Also as a consequence
of the properties of quasitriangularity, $(\epsilon \otimes 1)R = (1 \otimes \epsilon)R = 1 \in H$; moreover $R^{-1} = (S \otimes 1)(R)$,
$R = (1 \otimes S)(R^{-1})$, and $(S \otimes S)(R) = R$. One may further show that the antipode S must be a linear
isomorphism, and thus $S^2$ is an automorphism. In fact, $S^2$ is given by conjugating by an invertible element:
$S^2(x) = u x u^{-1}$ where $u = m(S \otimes 1)R^{21}$ (cf. Ribbon Hopf algebras).
It is possible to construct a quasitriangular Hopf algebra from a Hopf algebra and its dual, using the Drinfel'd
quantum double construction.
Twisting
The property of being a quasi-triangular Hopf algebra is preserved by twisting via an invertible element
$F = \sum_i f^i \otimes f_i \in \mathcal{A} \otimes \mathcal{A}$ such that $(\varepsilon \otimes \mathrm{id})F = (\mathrm{id} \otimes \varepsilon)F = 1$ and satisfying the cocycle condition
$(F \otimes 1) \circ (\Delta \otimes \mathrm{id})F = (1 \otimes F) \circ (\mathrm{id} \otimes \Delta)F.$
Furthermore, $u = \sum_i f^i S(f_i)$ is invertible and the twisted antipode is given by $S'(a) = u\,S(a)\,u^{-1}$, with the
twisted comultiplication, R-matrix and co-unit changing according to those defined for the quasi-triangular quasi-Hopf
algebra. Such a twist is known as an admissible (or Drinfel'd) twist.
Notes
[1] Montgomery & Schneider (2002), p. 72 (http://books.google.com/books?id=I3IK9U5Co_0C&pg=PA72&dq=''Quasitriangular'').
References
• Susan Montgomery, Hans-Jurgen Schneider. New directions in Hopf algebras , Volume 43. Cambridge University
Press, 2002. ISBN 9780521815123
Ribbon Hopf algebra
A ribbon Hopf algebra $(A, m, \Delta, u, \varepsilon, S, \mathcal{R}, \nu)$ is a quasitriangular Hopf algebra which possesses an invertible
central element $\nu$, more commonly known as the ribbon element, such that the following conditions hold:
$\nu^2 = uS(u), \quad S(\nu) = \nu, \quad \varepsilon(\nu) = 1,$
$\Delta(\nu) = (\mathcal{R}_{21}\mathcal{R}_{12})^{-1}(\nu \otimes \nu),$
where $u = m(S \otimes \mathrm{id})(\mathcal{R}_{21})$. Note that the element u exists for any quasitriangular Hopf algebra, and $uS(u)$
must always be central and satisfies
$S(uS(u)) = uS(u), \quad \varepsilon(uS(u)) = 1, \quad \Delta(uS(u)) = (\mathcal{R}_{21}\mathcal{R}_{12})^{-2}(uS(u) \otimes uS(u)),$
so that all that is required is that it have a central square root with the above properties.
Here
A is a vector space
m is the multiplication map $m : A \otimes A \to A$
$\Delta$ is the co-product map $\Delta : A \to A \otimes A$
u is the unit operator $u : \mathbb{C} \to A$
$\varepsilon$ is the co-unit operator $\varepsilon : A \to \mathbb{C}$
S is the antipode $S : A \to A$
$\mathcal{R}$ is a universal R matrix
We assume that the underlying field K is $\mathbb{C}$.
See also
• Quasitriangular Hopf algebra
• Quasi-triangular Quasi-Hopf algebra
References
• Altschuler, D., Coste, A.: Quasi-quantum groups, knots, three-manifolds and topological field theory. Commun.
Math. Phys. 150 1992 83-107 http://arxiv.org/pdf/hep-th/9202047
• Chari, V.C., Pressley, A.: A Guide to Quantum Groups Cambridge University Press, 1994 ISBN 0-521-55884-0.
• Vladimir Drinfeld, Quasi-Hopf algebras, Leningrad Math J. 1 (1989), 1419-1457
• Shahn Majid : Foundations of Quantum Group Theory Cambridge University Press, 1995
Quasi-triangular Quasi-Hopf algebra
A quasi-triangular quasi-Hopf algebra is a specialized form of a quasi-Hopf algebra defined by the Ukrainian
mathematician Vladimir Drinfeld in 1989. It is also a generalized form of a quasi-triangular Hopf algebra.
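A quasi-triangular quasi-Hopf algebra is a set $\mathcal{H}_{\mathcal{A}} = (\mathcal{A}, R, \Delta, \varepsilon, \Phi)$ where $\mathcal{B}_{\mathcal{A}} = (\mathcal{A}, \Delta, \varepsilon, \Phi)$ is a
quasi-Hopf algebra and $R \in \mathcal{A} \otimes \mathcal{A}$, known as the R-matrix, is an invertible element such that
$R\,\Delta(a) = \sigma \circ \Delta(a)\,R, \quad a \in \mathcal{A},$
$\sigma : \mathcal{A} \otimes \mathcal{A} \to \mathcal{A} \otimes \mathcal{A},$
$x \otimes y \mapsto y \otimes x,$
so that $\sigma$ is the switch map and
$(\Delta \otimes \mathrm{id})R = \Phi_{321} R_{13} \Phi_{132}^{-1} R_{23} \Phi_{123},$
$(\mathrm{id} \otimes \Delta)R = \Phi_{231}^{-1} R_{13} \Phi_{213} R_{12} \Phi_{123}^{-1},$
where $\Phi_{abc} = x_a \otimes x_b \otimes x_c$ and $\Phi_{123} = \Phi = x_1 \otimes x_2 \otimes x_3 \in \mathcal{A} \otimes \mathcal{A} \otimes \mathcal{A}$.
The quasi-Hopf algebra becomes triangular if in addition, $R_{21} R_{12} = 1$.
The twisting of $\mathcal{H}_{\mathcal{A}}$ by $F \in \mathcal{A} \otimes \mathcal{A}$ is the same as for a quasi-Hopf algebra, with the additional definition of the
twisted R-matrix
$R_F = F_{21}\,R\,F_{12}^{-1}.$
A quasi-triangular (resp. triangular) quasi-Hopf algebra with $\Phi = 1$ is a quasi-triangular (resp. triangular) Hopf
algebra, as the latter two conditions in the definition reduce to the conditions of quasi-triangularity of a Hopf algebra.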
Similarly to the twisting properties of the quasi-Hopf algebra, the property of being quasi-triangular or triangular
quasi-Hopf algebra is preserved by twisting.
See also
• Quasitriangular Hopf algebra
• Ribbon Hopf algebra
References
• Vladimir Drinfeld, Quasi-Hopf algebras, Leningrad Math J. 1 (1989), 1419-1457
• J.M. Maillet and J. Sanchez de Santos, Drinfeld Twists and Algebraic Bethe Ansatz, Amer. Math. Soc. Transl. (2)
Vol. 201, 2000
Grassmann algebra
In mathematics, the exterior product or wedge product of vectors is an algebraic construction generalizing certain
features of the cross product to higher dimensions. Like the cross product, and the scalar triple product, the exterior
product of vectors is used in Euclidean geometry to study areas, volumes, and their higher-dimensional analogs.
Also, like the cross product, the exterior product is alternating, meaning that $u \wedge u = 0$ for all vectors u, or
equivalently [1] $u \wedge v = -v \wedge u$ for all vectors u and v. In linear algebra, the exterior product provides an abstract
algebraic manner for describing the determinant and the minors of a linear transformation that is basis-independent,
and is fundamentally related to ideas of rank and linear independence.
The exterior algebra (also known as the Grassmann algebra, after Hermann Grassmann [2]) of a given vector
space V over a field K is the unital associative algebra $\Lambda(V)$ generated by the exterior product. It is widely used in
contemporary geometry, especially differential geometry and algebraic geometry through the algebra of differential
forms, as well as in multilinear algebra and related fields. In terms of category theory, the exterior algebra is a type
of functor on vector spaces, given by a universal construction. The universal construction allows the exterior algebra
to be defined, not just for vector spaces over a field, but also for modules over a commutative ring, and for other
structures of interest. The exterior algebra is one example of a bialgebra, meaning that its dual space also possesses a
product, and this dual product is compatible with the wedge product. This dual algebra is precisely the algebra of
alternating multilinear forms on V, and the pairing between the exterior algebra and its dual is given by the interior
product.
Motivating examples
Areas in the plane
The Cartesian plane $\mathbf{R}^2$ is a vector space equipped with a basis
consisting of a pair of unit vectors
$e_1 = (1, 0), \quad e_2 = (0, 1).$
[Figure: the area of a parallelogram in terms of the determinant of the matrix of coordinates of two of its vertices.]
Suppose that
$v = v_1 e_1 + v_2 e_2, \quad w = w_1 e_1 + w_2 e_2$
are a pair of given vectors in $\mathbf{R}^2$, written in components. There is a unique parallelogram having v and w as two of its
sides. The area of this parallelogram is given by the standard determinant formula:
$A = |\det [v\ w]| = |v_1 w_2 - v_2 w_1|.$
Consider now the exterior product of v and w:
$v \wedge w = (v_1 e_1 + v_2 e_2) \wedge (w_1 e_1 + w_2 e_2)$
$= v_1 w_1\, e_1 \wedge e_1 + v_1 w_2\, e_1 \wedge e_2 + v_2 w_1\, e_2 \wedge e_1 + v_2 w_2\, e_2 \wedge e_2$
$= (v_1 w_2 - v_2 w_1)\, e_1 \wedge e_2$
where the first step uses the distributive law for the wedge product, and the last uses the fact that the wedge product
is alternating, and in particular $e_2 \wedge e_1 = -e_1 \wedge e_2$. Note that the coefficient in this last expression is precisely the
determinant of the matrix [v w]. The fact that this may be positive or negative has the intuitive meaning that v and w
may be oriented in a counterclockwise or clockwise sense as the vertices of the parallelogram they define. Such an
area is called the signed area of the parallelogram: the absolute value of the signed area is the ordinary area, and the
sign determines its orientation.
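As a small numerical illustration of the signed area, the following Python sketch computes the coefficient of the wedge product of two plane vectors. The function name `wedge2` and the sample vectors are illustrative choices, not part of the original text.

```python
def wedge2(v, w):
    """Coefficient of e1 ^ e2 in v ^ w for v, w in R^2 (the signed area)."""
    return v[0] * w[1] - v[1] * w[0]

v, w = (3, 1), (1, 2)

signed_area = wedge2(v, w)      # counterclockwise pair: positive
reversed_area = wedge2(w, v)    # swapping the vectors flips the sign
area = abs(signed_area)         # ordinary (unsigned) area

print(signed_area, reversed_area, area)
```

Swapping the two arguments changes only the sign, which is the orientation information that the absolute value in the determinant formula discards.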
The fact that this coefficient is the signed area is not an accident. In fact, it is relatively easy to see that the exterior
product should be related to the signed area if one tries to axiomatize this area as an algebraic construct. In detail, if
A(v,w) denotes the signed area of the parallelogram determined by the pair of vectors v and w, then A must satisfy
the following properties:
1. $A(av, bw) = ab\,A(v, w)$ for any real numbers a and b, since rescaling either of the sides rescales the area by the
same amount (and reversing the direction of one of the sides reverses the orientation of the parallelogram).
2. $A(v, v) = 0$, since the area of the degenerate parallelogram determined by v (i.e., a line segment) is zero.
3. $A(w, v) = -A(v, w)$, since interchanging the roles of v and w reverses the orientation of the parallelogram.
4. $A(v + aw, w) = A(v, w)$, since adding a multiple of w to v affects neither the base nor the height of the
parallelogram and consequently preserves its area.
5. $A(e_1, e_2) = 1$, since the area of the unit square is one.
With the exception of the last property, the wedge product satisfies the same formal properties as the area. In a
certain sense, the wedge product generalizes the final property by allowing the area of a parallelogram to be
compared to that of any "standard" chosen parallelogram. In other words, the exterior product in two dimensions is a
basis-independent formulation of area. [3]
Cross and triple products
For vectors in $\mathbf{R}^3$, the exterior algebra is closely related to the cross product and triple product. Using the standard
basis $\{e_1, e_2, e_3\}$, the wedge product of a pair of vectors
$u = u_1 e_1 + u_2 e_2 + u_3 e_3$
and
$v = v_1 e_1 + v_2 e_2 + v_3 e_3$
is
$u \wedge v = (u_1 v_2 - u_2 v_1)(e_1 \wedge e_2) + (u_3 v_1 - u_1 v_3)(e_3 \wedge e_1) + (u_2 v_3 - u_3 v_2)(e_2 \wedge e_3)$
where $\{e_1 \wedge e_2, e_3 \wedge e_1, e_2 \wedge e_3\}$ is the basis for the three-dimensional space $\Lambda^2(\mathbf{R}^3)$. This imitates the usual
definition of the cross product of vectors in three dimensions.
Bringing in a third vector
$w = w_1 e_1 + w_2 e_2 + w_3 e_3,$
the wedge product of three vectors is
$u \wedge v \wedge w = (u_1 v_2 w_3 + u_2 v_3 w_1 + u_3 v_1 w_2 - u_1 v_3 w_2 - u_2 v_1 w_3 - u_3 v_2 w_1)(e_1 \wedge e_2 \wedge e_3)$
where $e_1 \wedge e_2 \wedge e_3$ is the basis vector for the one-dimensional space $\Lambda^3(\mathbf{R}^3)$. This imitates the usual definition of the
triple product.
The cross product and triple product in three dimensions each admit both geometric and algebraic interpretations.
The cross product uxv can be interpreted as a vector which is perpendicular to both u and v and whose magnitude is
equal to the area of the parallelogram determined by the two vectors. It can also be interpreted as the vector
consisting of the minors of the matrix with columns u and v. The triple product of u, v, and w is geometrically a
(signed) volume. Algebraically, it is the determinant of the matrix with columns u, v, and w. The exterior product in
three-dimensions allows for similar interpretations. In fact, in the presence of a positively oriented orthonormal
basis, the exterior product generalizes these notions to higher dimensions.
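The correspondence between the wedge product in three dimensions and the cross and triple products can be checked numerically. The sketch below is only an illustration: the helper names `wedge_coeffs`, `cross`, and `triple_wedge` and the sample vectors are assumptions introduced here.

```python
def wedge_coeffs(u, v):
    """Coefficients of u ^ v on the ordered basis (e1^e2, e3^e1, e2^e3)."""
    return (u[0] * v[1] - u[1] * v[0],
            u[2] * v[0] - u[0] * v[2],
            u[1] * v[2] - u[2] * v[1])

def cross(u, v):
    """Usual cross product, components ordered (x, y, z)."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def triple_wedge(u, v, w):
    """Coefficient of e1 ^ e2 ^ e3 in u ^ v ^ w."""
    return (u[0] * v[1] * w[2] + u[1] * v[2] * w[0] + u[2] * v[0] * w[1]
            - u[0] * v[2] * w[1] - u[1] * v[0] * w[2] - u[2] * v[1] * w[0])

u, v, w = (1, 2, 3), (4, 5, 6), (7, 8, 10)

# The same three numbers appear, with e2^e3, e3^e1, e1^e2 playing the roles of x, y, z.
print(wedge_coeffs(u, v), cross(u, v))

# The triple wedge coefficient equals the scalar triple product (u x v) . w.
dot = sum(c * wi for c, wi in zip(cross(u, v), w))
print(triple_wedge(u, v, w), dot)
```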
Formal definitions and algebraic properties
The exterior algebra $\Lambda(V)$ over a vector space V is defined as the quotient algebra of the tensor algebra by the
two-sided ideal I generated by all elements of the form $x \otimes x$ such that $x \in V$. [4] Symbolically,
$\Lambda(V) := T(V)/I.$
The wedge product $\wedge$ of two elements of $\Lambda(V)$ is defined by
$\alpha \wedge \beta = \alpha \otimes \beta \pmod{I}.$
Anticommutativity of the wedge product
The wedge product is alternating on elements of V, which means that $x \wedge x = 0$ for all $x \in V$. It follows that the
product is also anticommutative on elements of V, for supposing that $x, y \in V$,
$0 = (x + y) \wedge (x + y) = x \wedge x + x \wedge y + y \wedge x + y \wedge y = x \wedge y + y \wedge x$
whence
$x \wedge y = -y \wedge x.$
More generally, if $x_1, x_2, \ldots, x_k$ are elements of V, and $\sigma$ is a permutation of the integers $[1, \ldots, k]$, then
$x_{\sigma(1)} \wedge x_{\sigma(2)} \wedge \cdots \wedge x_{\sigma(k)} = \operatorname{sgn}(\sigma)\, x_1 \wedge x_2 \wedge \cdots \wedge x_k,$
where $\operatorname{sgn}(\sigma)$ is the signature of the permutation $\sigma$. [5]
The exterior power
The kth exterior power of V, denoted $\Lambda^k(V)$, is the vector subspace of $\Lambda(V)$ spanned by elements of the form
$x_1 \wedge x_2 \wedge \cdots \wedge x_k, \quad x_i \in V, \ i = 1, 2, \ldots, k.$
If $\alpha \in \Lambda^k(V)$, then $\alpha$ is said to be a k-multivector. If, furthermore, $\alpha$ can be expressed as a wedge product of k
elements of V, then $\alpha$ is said to be decomposable. Although decomposable multivectors span $\Lambda^k(V)$, not every
element of $\Lambda^k(V)$ is decomposable. For example, in $\mathbf{R}^4$, the following 2-multivector is not decomposable:
$\alpha = e_1 \wedge e_2 + e_3 \wedge e_4.$
(This is in fact a symplectic form, since $\alpha \wedge \alpha \neq 0$. [6])
Basis and dimension
If the dimension of V is n and $\{e_1, \ldots, e_n\}$ is a basis of V, then the set
$\{e_{i_1} \wedge e_{i_2} \wedge \cdots \wedge e_{i_k} \mid 1 \le i_1 < i_2 < \cdots < i_k \le n\}$
is a basis for $\Lambda^k(V)$. The reason is the following: given any wedge product of the form
$v_1 \wedge \cdots \wedge v_k$
then every vector $v_j$ can be written as a linear combination of the basis vectors $e_i$; using the bilinearity of the wedge
product, this can be expanded to a linear combination of wedge products of those basis vectors. Any wedge product
in which the same basis vector appears more than once is zero; any wedge product in which the basis vectors do not
appear in the proper order can be reordered, changing the sign whenever two basis vectors change places. In general,
the resulting coefficients of the basis vectors can be computed as the minors of the matrix that describes the vectors
$v_j$ in terms of the basis $e_i$.
By counting the basis elements, the dimension of $\Lambda^k(V)$ is the binomial coefficient C(n,k). In particular, $\Lambda^k(V) = \{0\}$
for k > n.
Any element of the exterior algebra can be written as a sum of multivectors. Hence, as a vector space the exterior
algebra is a direct sum
$\Lambda(V) = \Lambda^0(V) \oplus \Lambda^1(V) \oplus \Lambda^2(V) \oplus \cdots \oplus \Lambda^n(V)$
(where by convention $\Lambda^0(V) = K$ and $\Lambda^1(V) = V$), and therefore its dimension is equal to the sum of the binomial
coefficients, which is $2^n$.
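The reordering rule described above (zero for a repeated basis vector, a sign change for each swap) is enough to sketch a small exterior algebra calculator. In the Python sketch below, a multivector is stored as a dictionary from increasing index tuples to coefficients; this representation and the names `sort_sign`, `wedge`, and `basis` are assumptions made for illustration only.

```python
from itertools import combinations
from math import comb

def sort_sign(indices):
    """Sort basis indices, returning (sign, sorted tuple); sign is 0 if an index repeats."""
    idx = list(indices)
    if len(set(idx)) != len(idx):
        return 0, ()
    sign = 1
    # Bubble sort: every adjacent swap of two basis vectors flips the sign.
    for i in range(len(idx)):
        for j in range(len(idx) - 1 - i):
            if idx[j] > idx[j + 1]:
                idx[j], idx[j + 1] = idx[j + 1], idx[j]
                sign = -sign
    return sign, tuple(idx)

def wedge(a, b):
    """Wedge product of multivectors stored as {increasing index tuple: coefficient}."""
    out = {}
    for ia, ca in a.items():
        for ib, cb in b.items():
            sign, idx = sort_sign(ia + ib)
            if sign:
                out[idx] = out.get(idx, 0) + sign * ca * cb
    return {k: c for k, c in out.items() if c != 0}

def basis(n, k):
    """Index tuples labelling the basis of the k-th exterior power of an n-dimensional space."""
    return list(combinations(range(1, n + 1), k))

n = 4
dims = [len(basis(n, k)) for k in range(n + 1)]
print(dims, sum(dims))                      # binomial coefficients and their sum 2**n

alpha = {(1, 2): 1, (3, 4): 1}              # e1^e2 + e3^e4
print(wedge(alpha, alpha))                  # nonzero, so alpha is not decomposable

decomposable = wedge({(1,): 1, (2,): 2}, {(3,): 1})   # (e1 + 2 e2) ^ e3
print(wedge(decomposable, decomposable))    # a decomposable 2-vector squares to zero
```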
Rank of a multivector
If $\alpha \in \Lambda^k(V)$, then it is possible to express $\alpha$ as a linear combination of decomposable multivectors:
$\alpha = \alpha^{(1)} + \alpha^{(2)} + \cdots + \alpha^{(s)}$
where each $\alpha^{(i)}$ is decomposable, say
$\alpha^{(i)} = \alpha^{(i)}_1 \wedge \cdots \wedge \alpha^{(i)}_k, \quad i = 1, 2, \ldots, s.$
The rank of the multivector $\alpha$ is the minimal number of decomposable multivectors in such an expansion of $\alpha$. This
is similar to the notion of tensor rank.
Rank is particularly important in the study of 2-multivectors (Sternberg 1974, §III.6) (Bryant et al. 1991). The rank
of a 2-multivector $\alpha$ can be identified with half the rank of the matrix of coefficients of $\alpha$ in a basis. Thus if $e_i$ is a
basis for V, then $\alpha$ can be expressed uniquely as
$\alpha = \sum_{i,j} a_{ij}\, e_i \wedge e_j$
where $a_{ij} = -a_{ji}$ (the matrix of coefficients is skew-symmetric). The rank of the matrix $a_{ij}$ is therefore even, and is
twice the rank of the form $\alpha$.
In characteristic 0, the 2-multivector $\alpha$ has rank p if and only if
$\underbrace{\alpha \wedge \cdots \wedge \alpha}_{p} \neq 0$
and
$\underbrace{\alpha \wedge \cdots \wedge \alpha}_{p+1} = 0.$
Graded structure
The wedge product of a k-multivector with a p-multivector is a (k+p)-multivector, once again invoking bilinearity.
As a consequence, the direct sum decomposition of the preceding section
$\Lambda(V) = \Lambda^0(V) \oplus \Lambda^1(V) \oplus \Lambda^2(V) \oplus \cdots \oplus \Lambda^n(V)$
gives the exterior algebra the additional structure of a graded algebra. Symbolically,
$\left(\Lambda^k(V)\right) \wedge \left(\Lambda^p(V)\right) \subset \Lambda^{k+p}(V).$
Moreover, the wedge product is graded anticommutative, meaning that if $\alpha \in \Lambda^k(V)$ and $\beta \in \Lambda^p(V)$, then
$\alpha \wedge \beta = (-1)^{kp}\, \beta \wedge \alpha.$
In addition to studying the graded structure on the exterior algebra, Bourbaki (1989) studies additional graded
structures on exterior algebras, such as those on the exterior algebra of a graded module (a module that already
carries its own gradation).
Universal property
Let V be a vector space over the field K. Informally, multiplication in $\Lambda(V)$ is performed by manipulating symbols
and imposing a distributive law, an associative law, and using the identity $v \wedge v = 0$ for $v \in V$. Formally, $\Lambda(V)$ is the
"most general" algebra in which these rules hold for the multiplication, in the sense that any unital associative
K-algebra containing V with alternating multiplication on V must contain a homomorphic image of $\Lambda(V)$. In other
words, the exterior algebra has the following universal property:
Given any unital associative K-algebra A and any K-linear map $j : V \to A$ such that $j(v)j(v) = 0$ for every v in V, then
there exists precisely one unital algebra homomorphism $f : \Lambda(V) \to A$ such that $j(v) = f(i(v))$ for all v in V, where $i : V \to \Lambda(V)$ denotes the natural inclusion.
[Diagram: the inclusion $i : V \to \Lambda(V)$ and the map $j : V \to A$, with $f : \Lambda(V) \to A$ completing the commutative triangle.]
To construct the most general algebra that contains V and whose multiplication is alternating on V, it is natural to
start with the most general algebra that contains V, the tensor algebra T(V), and then enforce the alternating property
by taking a suitable quotient. We thus take the two-sided ideal I in T(V) generated by all elements of the form $v \otimes v$
for v in V, and define $\Lambda(V)$ as the quotient
$\Lambda(V) = T(V)/I$
(and use $\wedge$ as the symbol for multiplication in $\Lambda(V)$). It is then straightforward to show that $\Lambda(V)$ contains V and
satisfies the above universal property.
As a consequence of this construction, the operation of assigning to a vector space V its exterior algebra $\Lambda(V)$ is a
functor from the category of vector spaces to the category of algebras.
Rather than defining $\Lambda(V)$ first and then identifying the exterior powers $\Lambda^k(V)$ as certain subspaces, one may
alternatively define the spaces $\Lambda^k(V)$ first and then combine them to form the algebra $\Lambda(V)$. This approach is often
used in differential geometry and is described in the next section.
Generalizations
Given a commutative ring R and an R-module M, we can define the exterior algebra $\Lambda(M)$ just as above, as a suitable
quotient of the tensor algebra T(M). It will satisfy the analogous universal property. Many of the properties of $\Lambda(M)$
also require that M be a projective module. Where finite-dimensionality is used, the properties further require that M
be finitely generated and projective. Generalizations to the most common situations can be found in (Bourbaki
1989).
Exterior algebras of vector bundles are frequently considered in geometry and topology. There are no essential
differences between the algebraic properties of the exterior algebra of finite-dimensional vector bundles and those of
the exterior algebra of finitely-generated projective modules, by the Serre-Swan theorem. More general exterior
algebras can be defined for sheaves of modules.
Duality
Alternating operators
Given two vector spaces V and X, an alternating operator (or anti-symmetric operator) from $V^k$ to X is a multilinear
map
$f : V^k \to X$
such that whenever $v_1, \ldots, v_k$ are linearly dependent vectors in V, then
$f(v_1, \ldots, v_k) = 0.$
A well-known example is the determinant, an alternating operator from $(K^n)^n$ to K.
The map
$w : V^k \to \Lambda^k(V)$
which associates to k vectors from V their wedge product, i.e. their corresponding k-vector, is also alternating. In
fact, this map is the "most general" alternating operator defined on $V^k$: given any other alternating operator $f : V^k \to X$,
there exists a unique linear map $\varphi : \Lambda^k(V) \to X$ with $f = \varphi \circ w$. This universal property characterizes the space
$\Lambda^k(V)$ and can serve as its definition.
Alternating multilinear forms
The above discussion specializes to the case when X = K, the base field. In this case an alternating multilinear
function
$f : V^k \to K$
is called an alternating multilinear form. The set of all alternating multilinear forms is a vector space, as the sum
of two such maps, or the product of such a map with a scalar, is again alternating. By the universal property of the
exterior power, the space of alternating forms of degree k on V is naturally isomorphic with the dual vector space
$(\Lambda^k V)^*$. If V is finite-dimensional, then the latter is naturally isomorphic to $\Lambda^k(V^*)$. In particular, the dimension of
the space of anti-symmetric maps from $V^k$ to K is the binomial coefficient n choose k.
Under this identification, the wedge product takes a concrete form: it produces a new anti-symmetric map from two
given ones. Suppose $\omega : V^k \to K$ and $\eta : V^m \to K$ are two anti-symmetric maps. As in the case of tensor products of
multilinear maps, the number of variables of their wedge product is the sum of the numbers of their variables. It is
defined as follows:
$\omega \wedge \eta = \frac{(k+m)!}{k!\,m!} \operatorname{Alt}(\omega \otimes \eta)$
where the alternation Alt of a multilinear map is defined to be the signed average of the values over all the
permutations of its variables:
$\operatorname{Alt}(\omega)(x_1, \ldots, x_k) = \frac{1}{k!} \sum_{\sigma \in S_k} \operatorname{sgn}(\sigma)\, \omega(x_{\sigma(1)}, \ldots, x_{\sigma(k)}).$
This definition of the wedge product is well-defined even if the field K has finite characteristic, if one considers an
equivalent version of the above that does not use factorials or any constants:
$\omega \wedge \eta(x_1, \ldots, x_{k+m}) = \sum_{\sigma \in Sh_{k,m}} \operatorname{sgn}(\sigma)\, \omega(x_{\sigma(1)}, \ldots, x_{\sigma(k)})\, \eta(x_{\sigma(k+1)}, \ldots, x_{\sigma(k+m)}),$
where here $Sh_{k,m} \subset S_{k+m}$ is the subset of (k,m) shuffles: permutations $\sigma$ of the set $\{1, 2, \ldots, k+m\}$ such that $\sigma(1) < \sigma(2)
< \cdots < \sigma(k)$, and $\sigma(k+1) < \sigma(k+2) < \cdots < \sigma(k+m)$.
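The two descriptions of the wedge product of forms, one through the alternation map with factorial prefactor and one through shuffles, can be compared on a small example. The following Python sketch assumes forms are given as ordinary functions of vectors; the names `sgn`, `wedge_shuffle`, and `wedge_alt`, and the sample forms on $\mathbf{R}^3$, are illustrative assumptions.

```python
from fractions import Fraction
from itertools import combinations, permutations
from math import factorial

def sgn(perm):
    """Sign of a permutation given as a tuple of 0-based images."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def wedge_shuffle(omega, k, eta, m):
    """Wedge of a k-form and an m-form as a sum over (k, m)-shuffles (no factorials)."""
    def form(*x):
        total = 0
        for first in combinations(range(k + m), k):
            rest = tuple(i for i in range(k + m) if i not in first)
            total += (sgn(first + rest)
                      * omega(*[x[i] for i in first])
                      * eta(*[x[i] for i in rest]))
        return total
    return form

def wedge_alt(omega, k, eta, m):
    """Wedge as (k+m)!/(k! m!) * Alt(omega (x) eta), using exact rational arithmetic."""
    def form(*x):
        total = Fraction(0)
        for perm in permutations(range(k + m)):
            total += (sgn(perm)
                      * omega(*[x[i] for i in perm[:k]])
                      * eta(*[x[i] for i in perm[k:]]))
        # Alt divides by (k+m)!; the prefactor multiplies by (k+m)!/(k! m!).
        return total / (factorial(k) * factorial(m))
    return form

# Sample alternating forms on R^3: a 2-form and a 1-form.
omega = lambda x, y: x[0] * y[1] - x[1] * y[0]
eta = lambda z: z[2]

u, v, w = (1, 2, 3), (4, 5, 6), (7, 8, 10)
print(wedge_shuffle(omega, 2, eta, 1)(u, v, w))
print(wedge_alt(omega, 2, eta, 1)(u, v, w))
```

For these sample forms the wedge product is the determinant of the three vectors, so both calls should agree with the triple product of u, v, and w.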
Bialgebra structure
In formal terms, there is a correspondence between the graded dual of the graded algebra $\Lambda(V)$ and alternating
multilinear forms on V. The wedge product of multilinear forms defined above is dual to a coproduct defined on
$\Lambda(V)$, giving the structure of a coalgebra.
The coproduct is a linear function $\Delta : \Lambda(V) \to \Lambda(V) \otimes \Lambda(V)$ given on decomposable elements by
$\Delta(x_1 \wedge \cdots \wedge x_k) = \sum_{p=0}^{k} \sum_{\sigma \in Sh_{p,k-p}} \operatorname{sgn}(\sigma)\,(x_{\sigma(1)} \wedge \cdots \wedge x_{\sigma(p)}) \otimes (x_{\sigma(p+1)} \wedge \cdots \wedge x_{\sigma(k)}).$
For example,
$\Delta(x_1) = 1 \otimes x_1 + x_1 \otimes 1,$
$\Delta(x_1 \wedge x_2) = 1 \otimes (x_1 \wedge x_2) + x_1 \otimes x_2 - x_2 \otimes x_1 + (x_1 \wedge x_2) \otimes 1.$
This extends by linearity to an operation defined on the whole exterior algebra. In terms of the coproduct, the wedge
product on the dual space is just the graded dual of the coproduct:
$(\alpha \wedge \beta)(x_1 \wedge \cdots \wedge x_k) = (\alpha \otimes \beta)\left(\Delta(x_1 \wedge \cdots \wedge x_k)\right)$
where the tensor product on the right-hand side is of multilinear maps (extended by zero on elements of
incompatible homogeneous degree: more precisely, $\alpha \wedge \beta = \varepsilon \circ (\alpha \otimes \beta) \circ \Delta$, where $\varepsilon$ is the counit, as defined
presently).
The counit is the homomorphism $\varepsilon : \Lambda(V) \to K$ which returns the 0-graded component of its argument. The
coproduct and counit, along with the wedge product, define the structure of a bialgebra on the exterior algebra.
With an antipode defined on homogeneous elements by $S(x) = (-1)^{\deg x} x$, the exterior algebra is furthermore a Hopf
algebra.
Interior product
Suppose that V is finite-dimensional. If $V^*$ denotes the dual space to the vector space V, then for each $\alpha \in V^*$, it is
possible to define an antiderivation on the algebra $\Lambda(V)$,
$i_\alpha : \Lambda^k V \to \Lambda^{k-1} V.$
This derivation is called the interior product with $\alpha$, or sometimes the insertion operator, or contraction by $\alpha$.
Suppose that $w \in \Lambda^k V$. Then w is a multilinear mapping of $V^*$ to K, so it is defined by its values on the k-fold
Cartesian product $V^* \times V^* \times \cdots \times V^*$. If $u_1, u_2, \ldots, u_{k-1}$ are k−1 elements of $V^*$, then define
$(i_\alpha w)(u_1, u_2, \ldots, u_{k-1}) = w(\alpha, u_1, u_2, \ldots, u_{k-1}).$
Additionally, let $i_\alpha f = 0$ whenever f is a pure scalar (i.e., belonging to $\Lambda^0 V$).
Axiomatic characterization and properties
The interior product satisfies the following properties:
1. For each k and each $\alpha \in V^*$,
$i_\alpha : \Lambda^k V \to \Lambda^{k-1} V.$
(By convention, $\Lambda^{-1} = 0$.)
2. If v is an element of V ($= \Lambda^1 V$), then $i_\alpha v = \alpha(v)$ is the dual pairing between elements of V and elements of $V^*$.
3. For each $\alpha \in V^*$, $i_\alpha$ is a graded derivation of degree −1:
$i_\alpha(a \wedge b) = (i_\alpha a) \wedge b + (-1)^{\deg a}\, a \wedge (i_\alpha b).$
In fact, these three properties are sufficient to characterize the interior product as well as define it in the general
infinite-dimensional case.
Further properties of the interior product include:
• $i_\alpha \circ i_\alpha = 0.$
• $i_\alpha \circ i_\beta = -i_\beta \circ i_\alpha.$
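The characterizing properties can be tested on basis elements. The Python sketch below stores a multivector as a dictionary from increasing index tuples to coefficients and a covector as a dictionary from indices to components; it applies the pairing rule to each factor of a basis wedge product with alternating signs, which is what properties 2 and 3 imply. The representation and the names `wedge`, `interior`, and `add` are assumptions made for illustration.

```python
def sort_sign(indices):
    """Sort basis indices, returning (sign, sorted tuple); sign is 0 if an index repeats."""
    idx = list(indices)
    if len(set(idx)) != len(idx):
        return 0, ()
    sign = 1
    for i in range(len(idx)):
        for j in range(len(idx) - 1 - i):
            if idx[j] > idx[j + 1]:
                idx[j], idx[j + 1] = idx[j + 1], idx[j]
                sign = -sign
    return sign, tuple(idx)

def wedge(a, b):
    """Wedge product of multivectors stored as {increasing index tuple: coefficient}."""
    out = {}
    for ia, ca in a.items():
        for ib, cb in b.items():
            sign, idx = sort_sign(ia + ib)
            if sign:
                out[idx] = out.get(idx, 0) + sign * ca * cb
    return {k: c for k, c in out.items() if c != 0}

def add(a, b, scale=1):
    """Return a + scale * b, dropping zero coefficients."""
    out = dict(a)
    for k, c in b.items():
        out[k] = out.get(k, 0) + scale * c
    return {k: c for k, c in out.items() if c != 0}

def interior(alpha, a):
    """Interior product with the covector alpha = {index: component}."""
    out = {}
    for idx, c in a.items():
        for pos, i in enumerate(idx):
            # Contract alpha with the factor at position pos; sign (-1)**pos from the derivation rule.
            coeff = (-1) ** pos * alpha.get(i, 0) * c
            if coeff:
                rest = idx[:pos] + idx[pos + 1:]
                out[rest] = out.get(rest, 0) + coeff
    return {k: v for k, v in out.items() if v != 0}

alpha = {1: 2, 2: -1, 3: 4}
a = {(1, 2): 1}                 # e1 ^ e2, degree 2
b = {(3,): 2, (1,): 5}          # 2 e3 + 5 e1, degree 1

lhs = interior(alpha, wedge(a, b))
rhs = add(wedge(interior(alpha, a), b), wedge(a, interior(alpha, b)), scale=(-1) ** 2)
print(lhs, rhs)                                # graded derivation rule
print(interior(alpha, interior(alpha, lhs)))   # applying the same contraction twice gives zero
```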
Hodge duality
Suppose that V has finite dimension n. Then the interior product induces a canonical isomorphism of vector spaces
$\Lambda^k(V^*) \otimes \Lambda^n(V) \to \Lambda^{n-k}(V).$
In the geometrical setting, a non-zero element of the top exterior power $\Lambda^n(V)$ (which is a one-dimensional vector
space) is sometimes called a volume form (or orientation form, although this term may sometimes lead to
ambiguity). Relative to a given volume form $\sigma$, the isomorphism is given explicitly by
$\alpha \in \Lambda^k(V^*) \mapsto i_\alpha \sigma \in \Lambda^{n-k}(V).$
If, in addition to a volume form, the vector space V is equipped with an inner product identifying V with $V^*$, then the
resulting isomorphism is called the Hodge dual (or more commonly the Hodge star operator)
$* : \Lambda^k(V) \to \Lambda^{n-k}(V).$
The composite of * with itself maps $\Lambda^k(V) \to \Lambda^k(V)$ and is always a scalar multiple of the identity map. In most
applications, the volume form is compatible with the inner product in the sense that it is a wedge product of an
orthonormal basis of V. In this case,
$* \circ * : \Lambda^k(V) \to \Lambda^k(V) = (-1)^{k(n-k)+q} I$
where I is the identity, and the inner product has metric signature (p,q): p plusses and q minuses.
Inner product
For V a finite-dimensional space, an inner product on V defines an isomorphism of V with $V^*$, and so also an
isomorphism of $\Lambda^k V$ with $(\Lambda^k V)^*$. The pairing between these two spaces also takes the form of an inner product. On
decomposable k-multivectors,
$\langle v_1 \wedge \cdots \wedge v_k, w_1 \wedge \cdots \wedge w_k \rangle = \det(\langle v_i, w_j \rangle),$
the determinant of the matrix of inner products. In the special case $v_i = w_i$, the inner product is the square norm of the
multivector, given by the determinant of the Gramian matrix $(\langle v_i, v_j \rangle)$. This is then extended bilinearly (or
sesquilinearly in the complex case) to a non-degenerate inner product on $\Lambda^k V$. If $e_i$, $i = 1, 2, \ldots, n$, form an orthonormal
basis of V, then the vectors of the form
$e_{i_1} \wedge \cdots \wedge e_{i_k}, \quad i_1 < \cdots < i_k,$
constitute an orthonormal basis for $\Lambda^k(V)$.
With respect to the inner product, exterior multiplication and the interior product are mutually adjoint. Specifically,
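for $v \in \Lambda^{k-1}(V)$, $w \in \Lambda^k(V)$, and $x \in V$,
$\langle x \wedge v, w \rangle = \langle v, i_{x^\flat} w \rangle$
where $x^\flat \in V^*$ is the linear functional defined by
$x^\flat(y) = \langle x, y \rangle$
for all $y \in V$. This property completely characterizes the inner product on the exterior algebra.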
Functoriality
Suppose that V and W are a pair of vector spaces and $f : V \to W$ is a linear transformation. Then, by the universal
construction, there exists a unique homomorphism of graded algebras
$\Lambda(f) : \Lambda(V) \to \Lambda(W)$
such that
$\Lambda(f)|_{\Lambda^1(V)} = f : V = \Lambda^1(V) \to W = \Lambda^1(W).$
In particular, $\Lambda(f)$ preserves homogeneous degree. The k-graded components of $\Lambda(f)$ are given on decomposable
elements by
$\Lambda(f)(x_1 \wedge \cdots \wedge x_k) = f(x_1) \wedge \cdots \wedge f(x_k).$
Let
$\Lambda^k(f) = \Lambda(f)|_{\Lambda^k(V)} : \Lambda^k(V) \to \Lambda^k(W).$
The components of the transformation $\Lambda^k(f)$ relative to a basis of V and W are given by the matrix of k × k minors of f. In
particular, if V = W and V is of finite dimension n, then $\Lambda^n(f)$ is a mapping of a one-dimensional vector space $\Lambda^n(V)$ to
itself, and is therefore given by a scalar: the determinant of f.
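The description of the induced map by minors can be tried out directly. The Python sketch below builds the matrix of k × k minors of a small integer matrix and checks that the top exterior power is the determinant and that the construction respects composition. The helper names `det`, `exterior_power`, and `matmul` and the sample matrices are illustrative assumptions.

```python
from itertools import combinations

def det(m):
    """Determinant by Laplace expansion along the first row (adequate for small matrices)."""
    if len(m) == 0:
        return 1
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def exterior_power(m, k):
    """Matrix of k x k minors of m, rows and columns indexed by increasing index sets."""
    rows = list(combinations(range(len(m)), k))
    cols = list(combinations(range(len(m[0])), k))
    return [[det([[m[i][j] for j in c] for i in r]) for c in cols] for r in rows]

def matmul(a, b):
    """Plain matrix product of nested lists."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

f = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]
g = [[2, 0, 1], [1, 1, 0], [0, 3, 1]]

print(exterior_power(f, 3), det(f))   # top exterior power is the 1 x 1 matrix [det f]

# Functoriality: the matrix of minors of a product is the product of the matrices of minors.
print(exterior_power(matmul(f, g), 2) == matmul(exterior_power(f, 2), exterior_power(g, 2)))
```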
Exactness
If
$0 \to U \to V \to W \to 0$
is a short exact sequence of vector spaces, then
$0 \to \Lambda^1(U) \wedge \Lambda(V) \to \Lambda(V) \to \Lambda(W) \to 0$
is an exact sequence of graded vector spaces [10] as is
$0 \to \Lambda(U) \to \Lambda(V).$ [11]
Direct sums
In particular, the exterior algebra of a direct sum is isomorphic to the tensor product of the exterior algebras:
$\Lambda(V \oplus W) = \Lambda(V) \otimes \Lambda(W).$
This is a graded isomorphism; i.e.,
$\Lambda^k(V \oplus W) = \bigoplus_{p+q=k} \Lambda^p(V) \otimes \Lambda^q(W).$
Slightly more generally, if
$0 \to U \to V \to W \to 0$
is a short exact sequence of vector spaces then $\Lambda^k(V)$ has a filtration
$0 = F^0 \subseteq F^1 \subseteq \cdots \subseteq F^k \subseteq F^{k+1} = \Lambda^k(V)$
with quotients $F^{p+1}/F^p = \Lambda^{k-p}(U) \otimes \Lambda^p(W)$. In particular, if U is 1-dimensional then
$0 \to U \otimes \Lambda^{k-1}(W) \to \Lambda^k(V) \to \Lambda^k(W) \to 0$
is exact, and if W is 1-dimensional then
$0 \to \Lambda^k(U) \to \Lambda^k(V) \to \Lambda^{k-1}(U) \otimes W \to 0$
is exact. [12]
The alternating tensor algebra
If K is a field of characteristic 0, [13] then the exterior algebra of a vector space V can be canonically identified with
the vector subspace of T(V) consisting of antisymmetric tensors. Recall that the exterior algebra is the quotient of
T(V) by the ideal I generated by $x \otimes x$.
Let $T^r(V)$ be the space of homogeneous tensors of degree r. This is spanned by decomposable tensors
$v_1 \otimes \cdots \otimes v_r, \quad v_i \in V.$
The antisymmetrization (or sometimes the skew-symmetrization) of a decomposable tensor is defined by
$\operatorname{Alt}(v_1 \otimes \cdots \otimes v_r) = \frac{1}{r!} \sum_{\sigma \in S_r} \operatorname{sgn}(\sigma)\, v_{\sigma(1)} \otimes \cdots \otimes v_{\sigma(r)}$
where the sum is taken over the symmetric group of permutations on the symbols {1,...,r}. This extends by linearity
and homogeneity to an operation, also denoted by Alt, on the full tensor algebra T(V). The image Alt(T(V)) is the
alternating tensor algebra, denoted A(V). This is a vector subspace of T(V), and it inherits the structure of a graded
vector space from that on T(V). It carries an associative graded product $\hat{\otimes}$ defined by
$t \mathbin{\hat{\otimes}} s = \operatorname{Alt}(t \otimes s).$
Although this product differs from the tensor product, the kernel of Alt is precisely the ideal I (again, assuming that K
has characteristic 0), and there is a canonical isomorphism
$A(V) \cong \Lambda(V).$
Index notation
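Suppose that V has finite dimension n, and that a basis $e_1, \ldots, e_n$ of V is given. Then any alternating tensor $t \in A^r(V) \subset
T^r(V)$ can be written in index notation as
$t = t^{i_1 i_2 \ldots i_r}\, e_{i_1} \otimes e_{i_2} \otimes \cdots \otimes e_{i_r}$
where $t^{i_1 \ldots i_r}$ is completely antisymmetric in its indices.
The wedge product of two alternating tensors t and s of ranks r and p is given by
$t \mathbin{\hat{\otimes}} s = \frac{1}{(r+p)!} \sum_{\sigma \in S_{r+p}} \operatorname{sgn}(\sigma)\, t^{i_{\sigma(1)} \ldots i_{\sigma(r)}} s^{i_{\sigma(r+1)} \ldots i_{\sigma(r+p)}}\, e_{i_1} \otimes e_{i_2} \otimes \cdots \otimes e_{i_{r+p}}.$
The components of this tensor are precisely the skew part of the components of the tensor product $s \otimes t$, denoted by
square brackets on the indices:
$(t \mathbin{\hat{\otimes}} s)^{i_1 \ldots i_{r+p}} = t^{[i_1 \ldots i_r} s^{i_{r+1} \ldots i_{r+p}]}.$
The interior product may also be described in index notation as follows. Let $t = t^{i_0 i_1 \ldots i_{r-1}}$ be an antisymmetric
tensor of rank r. Then, for $\alpha \in V^*$, $i_\alpha t$ is an alternating tensor of rank r−1, given by
$(i_\alpha t)^{i_1 \ldots i_{r-1}} = r \sum_{j=0}^{n} \alpha_j\, t^{j i_1 \ldots i_{r-1}},$
where n is the dimension of V.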
Applications
Linear geometry
The decomposable vectors have geometric interpretations: the bivector $u \wedge v$ represents the plane spanned by the
vectors, "weighted" with a number, given by the area of the oriented parallelogram with sides u and v. Analogously,
the 3-vector $u \wedge v \wedge w$ represents the spanned 3-space weighted by the volume of the oriented parallelepiped with
edges u, v, and w.
Projective geometry
Decomposable k-vectors in $\Lambda^k V$ correspond to weighted k-dimensional subspaces of V. In particular, the
Grassmannian of k-dimensional subspaces of V, denoted $Gr_k(V)$, can be naturally identified with an algebraic
subvariety of the projective space $P(\Lambda^k V)$. This is called the Plücker embedding.
Differential geometry
The exterior algebra has notable applications in differential geometry, where it is used to define differential forms. A
differential form at a point of a differentiable manifold is an alternating multilinear form on the tangent space at the
point. Equivalently, a differential form of degree k is a linear functional on the k-th exterior power of the tangent
space. As a consequence, the wedge product of multilinear forms defines a natural wedge product for differential
forms. Differential forms play a major role in diverse areas of differential geometry.
In particular, the exterior derivative gives the exterior algebra of differential forms on a manifold the structure of a
differential algebra. The exterior derivative commutes with pullback along smooth mappings between manifolds, and
it is therefore a natural differential operator. The exterior algebra of differential forms, equipped with the exterior
derivative, is a differential complex whose cohomology is called the de Rham cohomology of the underlying
manifold and plays a vital role in the algebraic topology of differentiable manifolds.
Representation theory
In representation theory, the exterior algebra is one of the two fundamental Schur functors on the category of vector
spaces, the other being the symmetric algebra. Together, these constructions are used to generate the irreducible
representations of the general linear group; see fundamental representation.
Physics
The exterior algebra is an archetypal example of a superalgebra, which plays a fundamental role in physical theories
pertaining to fermions and supersymmetry. For a physical discussion, see Grassmann number. For various other
applications of related ideas to physics, see superspace and supergroup (physics).
Lie algebra homology
Let L be a Lie algebra over a field k. It is then possible to define the structure of a chain complex on the exterior
algebra of L. This is a k-linear mapping
$$\partial : \Lambda^{p+1} L \to \Lambda^{p} L$$
defined on decomposable elements by
$$\partial(x_1 \wedge \cdots \wedge x_{p+1}) = \frac{1}{p+1} \sum_{j<\ell} (-1)^{j+\ell+1} [x_j, x_\ell] \wedge x_1 \wedge \cdots \wedge \hat{x}_j \wedge \cdots \wedge \hat{x}_\ell \wedge \cdots \wedge x_{p+1},$$
where a hat marks an omitted factor.
The Jacobi identity holds if and only if $\partial\partial = 0$, and so this is a necessary and sufficient condition for an
anticommutative nonassociative algebra L to be a Lie algebra. Moreover, in that case $\Lambda L$ is a chain complex with
boundary operator $\partial$. The homology associated to this complex is the Lie algebra homology.
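The link between the Jacobi identity and the vanishing of the squared boundary operator can be tested on small examples. The sketch below is one possible implementation of the boundary formula on basis multivectors, keeping the 1/(p+1) normalisation used in this article. The two 3-dimensional brackets are illustrative assumptions: the first satisfies the Jacobi identity, the second is anticommutative but does not.

```python
from fractions import Fraction
from itertools import combinations

def wedge_basis(k, rest):
    """e_k ^ e_rest for a sorted index tuple `rest`: returns (sign, sorted tuple)."""
    if k in rest:
        return 0, None                      # repeated factor, product vanishes
    pos = sum(1 for r in rest if r < k)     # transpositions needed to sort e_k in
    return (-1) ** pos, rest[:pos] + (k,) + rest[pos:]

def make_bracket(table):
    """Extend structure constants {(i, j): {k: c}} given for i < j antisymmetrically."""
    def bracket(i, j):
        if i == j:
            return {}
        if (i, j) in table:
            return table[(i, j)]
        return {k: -c for k, c in table.get((j, i), {}).items()}
    return bracket

def boundary(elem, bracket):
    """Boundary of elem = {sorted index tuple: coefficient}, linear in elem."""
    out = {}
    for idx, coeff in elem.items():
        m = len(idx)                        # m = p + 1 factors
        for a, b in combinations(range(m), 2):
            sign = (-1) ** ((a + 1) + (b + 1) + 1)   # (-1)^(j + l + 1), 1-based
            rest = idx[:a] + idx[a + 1:b] + idx[b + 1:]
            for k, c in bracket(idx[a], idx[b]).items():
                s, key = wedge_basis(k, rest)
                if s:
                    out[key] = out.get(key, 0) + Fraction(sign * s * c) * coeff / m
    return {k: v for k, v in out.items() if v != 0}

# [e0, e1] = e1, [e0, e2] = e2, [e1, e2] = 0: a Lie algebra.
lie = make_bracket({(0, 1): {1: 1}, (0, 2): {2: 1}})
# Same, but with [e1, e2] = e0: anticommutative, Jacobi identity fails.
non_lie = make_bracket({(0, 1): {1: 1}, (0, 2): {2: 1}, (1, 2): {0: 1}})

top = {(0, 1, 2): Fraction(1)}              # e0 ^ e1 ^ e2
print(boundary(top, lie))
print(boundary(boundary(top, lie), lie))            # empty: boundary squares to zero
print(boundary(boundary(top, non_lie), non_lie))    # nonzero multiple of e0
```

On a 3-vector, the double boundary works out to one sixth of the Jacobi expression $[[x_1,x_2],x_3] + [[x_2,x_3],x_1] + [[x_3,x_1],x_2]$, which is what the second example detects.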
Homological algebra
The exterior algebra is the main ingredient in the construction of the Koszul complex, a fundamental object in
homological algebra.
History
The exterior algebra was first introduced by Hermann Grassmann in 1844 under the blanket term of
Ausdehnungslehre, or Theory of Extension.[14] This referred more generally to an algebraic (or axiomatic) theory of
extended quantities and was one of the early precursors to the modern notion of a vector space. Saint-Venant also
published similar ideas of exterior calculus for which he claimed priority over Grassmann.[15]
The algebra itself was built from a set of rules, or axioms, capturing the formal aspects of Cayley and Sylvester's
theory of multivectors. It was thus a calculus, much like the propositional calculus, except focused exclusively on
the task of formal reasoning in geometrical terms.[16] In particular, this new development allowed for an axiomatic
characterization of dimension, a property that had previously only been examined from the coordinate point of view.
The import of this new theory of vectors and multivectors was lost to mid 19th century mathematicians,[17] until
being thoroughly vetted by Giuseppe Peano in 1888. Peano's work also remained somewhat obscure until the turn of
the century, when the subject was unified by members of the French geometry school (notably Henri Poincaré, Élie
Cartan, and Gaston Darboux) who applied Grassmann's ideas to the calculus of differential forms.
A short while later, Alfred North Whitehead, borrowing from the ideas of Peano and Grassmann, introduced his
universal algebra. This then paved the way for the 20th century developments of abstract algebra by placing the
axiomatic notion of an algebraic system on a firm logical footing.
Notes
[1] Provided the characteristic is different from 2.
[2] Grassmann (1844) introduced these as extended algebras (cf. Clifford 1878). He used the word äußere (literally translated as outer, or
exterior) only to indicate the produkt he defined, which is nowadays conventionally called exterior product, probably to distinguish it from the
outer product as defined in modern linear algebra.
[3] This axiomatization of areas is due to Leopold Kronecker and Karl Weierstrass; see Bourbaki (1989, Historical Note). For a modern
treatment, see MacLane & Birkhoff (1999, Theorem IX.2.2). For an elementary treatment, see Strang (1993, Chapter 5).
[4] This definition is a standard one. See, for instance, MacLane & Birkhoff (1999).
[5] A proof of this can be found in more generality in Bourbaki (1989).
[6] See Sternberg (1964, §III.6).
[7] See Bourbaki (1989, III.7.1), and MacLane & Birkhoff (1999, Theorem XVI.6.8). More detail on universal properties in general can be found
in MacLane & Birkhoff (1999, Chapter VI), and throughout the works of Bourbaki.
[8] Some conventions, particularly in physics, define the wedge product as
$\omega \wedge \eta = \operatorname{Alt}(\omega \otimes \eta)$.
This convention is not adopted here, but is discussed in connection with alternating tensors.
[9] Indeed, the exterior algebra of V is the enveloping algebra of the abelian Lie superalgebra structure on V.
[10] This part of the statement also holds in greater generality if V and W are modules over a commutative ring: that Λ converts epimorphisms to
epimorphisms. See Bourbaki (1989, Proposition 3, III.7.2).
[11] This statement generalizes only to the case where V and W are projective modules over a commutative ring. Otherwise, it is generally not the
case that Λ converts monomorphisms to monomorphisms. See Bourbaki (1989, Corollary to Proposition 12, III.7.9).
[12] Such a filtration also holds for vector bundles, and projective modules over a commutative ring. This is thus more general than the result
quoted above for direct sums, since not every short exact sequence splits in other abelian categories.
[13] See Bourbaki (1989, III.7.5) for generalizations.
[14] Kannenberg (2000) published a translation of Grassmann's work in English; he translated Ausdehnungslehre as Extension Theory.
[15] J Itard, Biography in Dictionary of Scientific Biography (New York 1970-1990).
[16] Authors have in the past referred to this calculus variously as the calculus of extension (Whitehead 1898; Forder 1941), or extensive algebra
(Clifford 1878), and recently as extended vector algebra (Browne 2007).
[17] Bourbaki 1989, p. 661.
References
Mathematical references
• Bishop, R.; Goldberg, S.I. (1980), Tensor analysis on manifolds, Dover, ISBN 0-486-64039-6
Includes a treatment of alternating tensors and alternating forms, as well as a detailed discussion of
Hodge duality from the perspective adopted in this article.
• Bourbaki, Nicolas (1989), Elements of mathematics, Algebra I, Springer-Verlag, ISBN 3-540-64243-9
This is the main mathematical reference for the article. It introduces the exterior algebra of a module
over a commutative ring (although this article specializes primarily to the case when the ring is a field),
including a discussion of the universal property, functoriality, duality, and the bialgebra structure. See
chapters III.7 and III.11.
• Bryant, R.L.; Chern, S.S.; Gardner, R.B.; Goldschmidt, H.L.; Griffiths, P.A. (1991), Exterior differential systems,
Springer-Verlag
This book contains applications of exterior algebras to problems in partial differential equations. Rank
and related concepts are developed in the early chapters.
• MacLane, S.; Birkhoff, G. (1999), Algebra, AMS Chelsea, ISBN 0-8218-1646-2
Chapter XVI sections 6-10 give a more elementary account of the exterior algebra, including duality,
determinants and minors, and alternating forms.
• Sternberg, Shlomo (1964), Lectures on Differential Geometry, Prentice Hall
Contains a classical treatment of the exterior algebra as alternating tensors, and applications to
differential geometry.
Historical references
• Bourbaki, Nicolas (1989), "Historical note on chapters II and III", Elements of mathematics, Algebra I,
Springer-Verlag
• Clifford, W. (1878), "Applications of Grassmann's Extensive Algebra" (http://jstor.org/stable/2369379),
American Journal of Mathematics (The Johns Hopkins University Press) 1 (4): 350-358, doi:10.2307/2369379
• Forder, H. G. (1941), The Calculus of Extension, Cambridge University Press
• Grassmann, Hermann (1844), Die Lineale Ausdehnungslehre - Ein neuer Zweig der Mathematik
(http://books.google.com/books?id=bKgAAAAAMAAJ&pg=PA1&dq=Die+Lineale+Ausdehnungslehre+ein+neuer+Zweig+der+Mathematik)
(The Linear Extension Theory - A new Branch of Mathematics)
• Kannenberg, Lloyd (2000), Extension Theory (translation of Grassmann's Ausdehnungslehre), American
Mathematical Society, ISBN 0821820311
• Peano, Giuseppe (1888), Calcolo Geometrico secondo l'Ausdehnungslehre di H. Grassmann preceduto dalle
Operazioni della Logica Deduttiva; Kannenberg, Lloyd (1999), Geometric calculus: according to the
Ausdehnungslehre of H. Grassmann, Birkhäuser, ISBN 978-0817641269.
• Whitehead, Alfred North (1898), A Treatise on Universal Algebra, with Applications
(http://historical.library.cornell.edu/cgi-bin/cul.math/docviewer?did=01950001&seq=5), Cambridge
Other references and further reading
• Browne, J.M. (2007), Grassmann algebra - Exploring applications of Extended Vector Algebra with
Mathematica, Published on line (http://www.grassmannalgebra.info/grassmannalgebra/book/index.htm)
An introduction to the exterior algebra, and geometric algebra, with a focus on applications. Also
includes a history section and bibliography.
• Spivak, Michael (1965), Calculus on manifolds, Addison-Wesley, ISBN 978-0805390216
Includes applications of the exterior algebra to differential forms, specifically focused on integration and
Stokes's theorem. The notation $\Lambda^k V$ in this text is used to mean the space of alternating k-forms on V;
i.e., for Spivak $\Lambda^k V$ is what this article would call $\Lambda^k V^*$. Spivak discusses this in Addendum 4.
• Strang, G. (1993), Introduction to linear algebra, Wellesley-Cambridge Press, ISBN 978-0961408855
Includes an elementary treatment of the axiomatization of determinants as signed areas, volumes, and
higher-dimensional volumes.
• Onishchik, A.L. (2001), "Exterior algebra" (http://eom.springer.de/E/e037080.htm), in Hazewinkel, Michiel,
Encyclopaedia of Mathematics, Springer, ISBN 978-1556080104
• Fleming, Wendell H. (1965), Functions of Several Variables, Addison-Wesley.
Chapter 6: Exterior algebra and differential calculus, pages 205-38. This textbook in multivariate
calculus introduces the exterior algebra of differential forms adroitly into the calculus sequence for
colleges.
• Winitzki, S. (2010), Linear Algebra via Exterior Products, Published on line (http://sites.google.com/site/
winitzki/linalg)
An introduction to the coordinate-free approach in basic finite-dimensional linear algebra, using exterior
products.
Supergroup
The concept of a supergroup generalizes that of a group: every group is a supergroup, but not every supergroup is
a group. A supergroup is like a Lie group in that it carries a well-defined notion of smooth function; the functions,
however, may have even and odd parts. Moreover, a supergroup has a Lie superalgebra, which plays a role similar to
that of the Lie algebra of a Lie group: it determines most of the representation theory and is the starting point for
classification.
More formally, a Lie supergroup is a supermanifold G together with a multiplication morphism
$\mu : G \times G \to G$, an inversion morphism $i : G \to G$ and a unit morphism $e : 1 \to G$ which make G a
group object in the category of supermanifolds. This means that, formulated as commutative diagrams, the usual
associativity and inversion axioms of a group continue to hold. Since every manifold is a supermanifold, a Lie
supergroup generalises the notion of a Lie group.
There are many possible supergroups. The ones of most interest in theoretical physics are the ones which extend the
Poincaré group or the conformal group. Of particular interest are the orthosymplectic groups Osp(N/M) and the
superconformal groups SU(N/M).
An equivalent algebraic approach starts from the observation that a supermanifold is determined by its ring of
supercommutative smooth functions, and that morphisms of supermanifolds correspond one-to-one to algebra
homomorphisms between their function algebras in the opposite direction, i.e. that the category of supermanifolds is
opposite to the category of algebras of smooth graded commutative functions. Reversing all the arrows in the
commutative diagrams that define a Lie supergroup then shows that functions over the supergroup have the structure
of a $\mathbb{Z}_2$-graded Hopf algebra. Likewise the representations of this Hopf algebra turn out to be
$\mathbb{Z}_2$-graded comodules. This Hopf algebra gives the global properties of the supergroup.
There is another related Hopf algebra which is the dual of the previous Hopf algebra. It can be identified with the
Hopf algebra of graded differential operators at the origin. It gives only the local properties of the symmetries, i.e.
it only gives information about infinitesimal supersymmetry transformations. The representations of this Hopf algebra
are modules. As in the non-graded case, this Hopf algebra can be described purely algebraically as the universal
enveloping algebra of the Lie superalgebra.
In a similar way one can define an affine algebraic supergroup as a group object in the category of superalgebraic
affine varieties. An affine algebraic supergroup has a similar one-to-one relation to its Hopf algebra of
superpolynomials. Using the language of schemes, which combines the geometric and algebraic points of view,
algebraic supergroup schemes can be defined, including super Abelian varieties.
Superalgebra
In mathematics and theoretical physics, a superalgebra is a $\mathbb{Z}_2$-graded algebra.[1] That is, it is an algebra over a
commutative ring or field with a decomposition into "even" and "odd" pieces and a multiplication operator that
respects the grading.
The prefix super- comes from the theory of supersymmetry in theoretical physics. Superalgebras and their
representations, supermodules, provide an algebraic framework for formulating supersymmetry. The study of such
objects is sometimes called super linear algebra. Superalgebras also play an important role in the related field of
supergeometry, where they enter into the definitions of graded manifolds, supermanifolds and superschemes.
Formal definition
Let K be a fixed commutative ring. In most applications, K is a field such as R or C.
A superalgebra over K is a K-module A with a direct sum decomposition
$$A = A_0 \oplus A_1$$
together with a bilinear multiplication $A \times A \to A$ such that
$$A_i A_j \subseteq A_{i+j}$$
where the subscripts are read modulo 2.
A superring, or $\mathbb{Z}_2$-graded ring, is a superalgebra over the ring of integers Z.
The elements of each $A_i$ are said to be homogeneous. The parity of a homogeneous element x, denoted by |x|, is 0 or 1
according to whether it is in $A_0$ or $A_1$. Elements of parity 0 are said to be even and those of parity 1 to be odd. If x
and y are both homogeneous then so is the product xy and |xy| = |x| + |y|.
An associative superalgebra is one whose multiplication is associative and a unital superalgebra is one with a
multiplicative identity element. The identity element in a unital superalgebra is necessarily even. Unless otherwise
specified, all superalgebras in this article are assumed to be associative and unital.
A commutative superalgebra is one which satisfies a graded version of commutativity. Specifically, A is
commutative if
$$yx = (-1)^{|x||y|} xy$$
for all homogeneous elements x and y of A.
Examples
• Any algebra over a commutative ring K may be regarded as a purely even superalgebra over K; that is, by taking
$A_1$ to be trivial.
• Any Z or N-graded algebra may be regarded as a superalgebra by reading the grading modulo 2. This includes
examples such as tensor algebras and polynomial rings over K.
• In particular, any exterior algebra over K is a superalgebra. The exterior algebra is the standard example of a
supercommutative algebra.
• The symmetric polynomials and alternating polynomials together form a superalgebra, being the even and odd
parts, respectively. Note that this is a different grading from the grading by degree.
• Clifford algebras are superalgebras. They are generally noncommutative.
• The set of all endomorphisms (both even and odd) of a super vector space forms a superalgebra under
composition.
• The set of all square supermatrices with entries in K forms a superalgebra denoted by $M_{p|q}(K)$. This algebra may
be identified with the algebra of endomorphisms of a free supermodule over K of rank p|q.
• Lie superalgebras are a graded analog of Lie algebras. Lie superalgebras are nonunital and nonassociative;
however, one may construct the analog of a universal enveloping algebra of a Lie superalgebra which is a unital,
associative superalgebra.
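The exterior algebra example can be made concrete. The sketch below models basis monomials of an exterior algebra on n generators as sorted tuples of generator indices, and checks on all pairs of basis monomials that the product respects parity and is graded-commutative. The representation and the function names are illustrative assumptions, not part of the definitions above.

```python
from itertools import combinations

def mul_basis(a, b):
    """Product of basis monomials given as sorted tuples of generator indices.
    Returns (sign, monomial); sign is 0 when a generator is repeated."""
    if set(a) & set(b):
        return 0, ()
    merged = a + b
    # Each inversion is one transposition of anticommuting generators.
    inversions = sum(1 for i in range(len(merged))
                     for j in range(i + 1, len(merged))
                     if merged[i] > merged[j])
    return (-1) ** inversions, tuple(sorted(merged))

def parity(a):
    """Parity of a monomial: its degree read modulo 2."""
    return len(a) % 2

def check_superalgebra_axioms(n):
    """Check |xy| = |x| + |y| (mod 2) and yx = (-1)^(|x||y|) xy on basis monomials."""
    basis = [c for k in range(n + 1) for c in combinations(range(n), k)]
    for a in basis:
        for b in basis:
            s_ab, m_ab = mul_basis(a, b)
            s_ba, _ = mul_basis(b, a)
            if s_ab and parity(m_ab) != (parity(a) + parity(b)) % 2:
                return False
            if s_ba != (-1) ** (parity(a) * parity(b)) * s_ab:
                return False
    return True

print(mul_basis((0,), (1,)))   # theta_0 theta_1
print(mul_basis((1,), (0,)))   # theta_1 theta_0 picks up a minus sign
print(mul_basis((0,), (0,)))   # theta_0 squared vanishes
print(check_superalgebra_axioms(3))
```

Because both conditions are bilinear, checking them on basis monomials is enough to establish them for all homogeneous elements.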
Further definitions and constructions
In the physics literature, a superalgebra (more precisely, a Lie superalgebra) is an algebra with a $\mathbb{Z}_2$ grading
("even" elements $E_i$ and "odd" elements $O_a$) such that (i) the bracket of two generators is always antisymmetric
except for two odd elements, where it is symmetric, and (ii) the Jacobi identities are satisfied:
$$[[E_i, E_j], O_a] = [E_i, [E_j, O_a]] - [E_j, [E_i, O_a]]$$
$$[E_i, \{O_a, O_b\}] = \{[E_i, O_a], O_b\} + \{[E_i, O_b], O_a\}$$
$$[O_a, \{O_b, O_c\}] = [\{O_a, O_b\}, O_c] + [\{O_a, O_c\}, O_b]$$
The first of these three identities says that the O form a representation of the ordinary Lie algebra spanned by E.
(Consider the O as vectors on which the E act.) The second is equivalent to the first if the Killing form is nonsingular.
The last identity restricts the possible representations O of the ordinary Lie algebra. This relation is the reason that
not every ordinary Lie algebra can be extended to a superalgebra.
Even subalgebra
Let A be a superalgebra over a commutative ring K. The submodule $A_0$, consisting of all even elements, is closed
under multiplication and contains the identity of A and therefore forms a subalgebra of A, naturally called the even
subalgebra. It forms an ordinary algebra over K.
The set of all odd elements $A_1$ is an $A_0$-bimodule whose scalar multiplication is just multiplication in A. The product
in A equips $A_1$ with a bilinear form
$$\mu : A_1 \otimes_{A_0} A_1 \to A_0$$
such that
$$\mu(x \otimes y) \cdot z = x \cdot \mu(y \otimes z)$$
for all x, y, and z in $A_1$. This follows from the associativity of the product in A.
Grade involution
There is a canonical involutive automorphism on any superalgebra called the grade involution. It is given on
homogeneous elements by
$$\hat{x} = (-1)^{|x|} x$$
and on arbitrary elements by
$$\hat{x} = x_0 - x_1$$
where $x_i$ are the homogeneous parts of x. If A has no 2-torsion (in particular, if 2 is invertible) then the grade
involution can be used to distinguish the even and odd parts of A:
$$A_i = \{x \in A : \hat{x} = (-1)^i x\}.$$
Supercommutativity
The supercommutator on A is the binary operator given by
$$[x, y] = xy - (-1)^{|x||y|} yx$$
on homogeneous elements. This can be extended to all of A by linearity. Elements x and y of A are said to
supercommute if [x, y] = 0.
The supercenter of A is the set of all elements of A which supercommute with all elements of A:
$$Z(A) = \{a \in A : [a, x] = 0 \text{ for all } x \in A\}.$$
The supercenter of A is, in general, different than the center of A as an ungraded algebra. A commutative
superalgebra is one whose supercenter is all of A.
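A small example shows how the supercenter can differ from the ordinary center. The sketch below assumes the Clifford algebra on one odd generator e with e² = 1 (an illustrative choice of sign convention), and stores an element a + b·e as the pair `(a, b)`. This algebra is commutative in the ungraded sense, yet e does not supercommute with itself.

```python
def mul(x, y):
    """(a + b e)(c + d e) with e*e = 1 (assumed convention)."""
    (a, b), (c, d) = x, y
    return (a * c + b * d, a * d + b * c)

def parts(x):
    """Homogeneous parts of x, each paired with its parity."""
    return [((x[0], 0), 0), ((0, x[1]), 1)]

def commutator(x, y):
    xy, yx = mul(x, y), mul(y, x)
    return (xy[0] - yx[0], xy[1] - yx[1])

def supercommutator(x, y):
    """[x, y] = xy - (-1)^(|x||y|) yx on homogeneous parts, extended by linearity."""
    total = (0, 0)
    for xi, p in parts(x):
        for yj, q in parts(y):
            xy, yx = mul(xi, yj), mul(yj, xi)
            sign = (-1) ** (p * q)
            total = (total[0] + xy[0] - sign * yx[0],
                     total[1] + xy[1] - sign * yx[1])
    return total

one, e = (1, 0), (0, 1)
print(commutator(e, e))         # zero: e is central in the ungraded sense
print(supercommutator(e, e))    # 2 e^2, which is nonzero
print(supercommutator(one, e))  # the identity supercommutes with everything
```

Here the ordinary center is the whole algebra, while the supercenter contains only the multiples of the identity (provided 2 is not zero in the base ring).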
Super tensor product
The graded tensor product of two superalgebras may be regarded as a superalgebra with a multiplication rule
determined by:
$$(a_1 \otimes b_1)(a_2 \otimes b_2) = (-1)^{|b_1||a_2|} (a_1 a_2 \otimes b_1 b_2).$$
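The sign in the super tensor product arises from moving $b_1$ past $a_2$, giving the factor $(-1)^{|b_1||a_2|}$. As a minimal sketch, take two exterior algebras on one generator each, with bases {1, θ} and {1, η}; a basis element $\theta^i \otimes \eta^j$ is stored as the pair `(i, j)` with i, j in {0, 1}. This encoding is an assumption made for illustration.

```python
from itertools import product

def tensor_mul(x, y):
    """Product of basis elements theta^i1 (x) eta^j1 and theta^i2 (x) eta^j2.
    Returns (sign, basis element), or (0, None) if the product vanishes."""
    (i1, j1), (i2, j2) = x, y
    if i1 + i2 > 1 or j1 + j2 > 1:
        return 0, None                       # theta^2 = 0 or eta^2 = 0
    return (-1) ** (j1 * i2), (i1 + i2, j1 + j2)   # sign (-1)^(|b1| |a2|)

def triple(x, y, z, left):
    """(xy)z if left is True, otherwise x(yz)."""
    if left:
        s, w = tensor_mul(x, y)
        t, r = tensor_mul(w, z) if s else (0, None)
    else:
        s, w = tensor_mul(y, z)
        t, r = tensor_mul(x, w) if s else (0, None)
    return (s * t, r) if s * t else (0, None)

theta, eta = (1, 0), (0, 1)                  # theta (x) 1 and 1 (x) eta
print(tensor_mul(theta, eta))                # + theta (x) eta
print(tensor_mul(eta, theta))                # - theta (x) eta

basis = list(product((0, 1), repeat=2))
associative = all(triple(x, y, z, True) == triple(x, y, z, False)
                  for x in basis for y in basis for z in basis)
print(associative)
```

With this sign rule, θ ⊗ 1 and 1 ⊗ η anticommute, so the super tensor product of the two one-generator exterior algebras behaves like an exterior algebra on two generators. Without the sign, the two elements would commute.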
Generalizations and categorical definition
One can easily generalize the definition of superalgebras to include superalgebras over a commutative superring. The
definition given above is then a specialization to the case where the base ring is purely even.
Let R be a commutative superring. A superalgebra over R is an R-supermodule A with an R-bilinear multiplication
$A \times A \to A$ that respects the grading. Bilinearity here means that
$$r \cdot (xy) = (r \cdot x)y = (-1)^{|r||x|} x(r \cdot y)$$
for all homogeneous elements $r \in R$ and $x, y \in A$.
Equivalently, one may define a superalgebra over R as a superring A together with a superring homomorphism $R \to A$
whose image lies in the supercenter of A.
One may also define superalgebras categorically. The category of all R-supermodules forms a monoidal category
under the super tensor product with R serving as the unit object. An associative, unital superalgebra over R can then
be defined as a monoid in the category of R-supermodules. That is, a superalgebra is an R-supermodule A with two
(even) morphisms
$$\mu : A \otimes A \to A$$
$$\eta : R \to A$$
for which the usual diagrams commute.
Notes
[1] Kac, Martinez & Zelmanov (2001), p. 3 (http://books.google.com/books?id=jTCNZz2Tk4cC&pg=PA3&dq="superalgebra").
[2] P. van Nieuwenhuizen, Phys. Rep. 68, 189 (1981)
References
• Deligne, Pierre; John W. Morgan (1999). "Notes on Supersymmetry (following Joseph Bernstein)". Quantum
Fields and Strings: A Course for Mathematicians. 1. American Mathematical Society. pp. 41-97. ISBN
0-8218-2012-5.
• Manin, Y. I. (1997). Gauge Field Theory and Complex Geometry (2nd ed.). Berlin: Springer.
ISBN 3-540-61378-1.
• Varadarajan, V. S. (2004). Supersymmetry for Mathematicians: An Introduction. Courant Lecture Notes in
Mathematics 11. American Mathematical Society. ISBN 0-8218-3574-2.
• Kac, Victor G.; Martinez, Consuelo; Zelmanov, Efim (2001). Graded simple Jordan superalgebras of growth
one. Memoirs of the AMS Series. 711. AMS Bookstore. ISBN 9780821826454.
Supergravity
In theoretical physics, supergravity (supergravity theory) is a field theory that combines the principles of
supersymmetry and general relativity. Together, these imply that, in supergravity, the supersymmetry is a local
symmetry (in contrast to non-gravitational supersymmetric theories, such as the Minimal Supersymmetric Standard
Model (MSSM)). Since the generators of supersymmetry (SUSY) combine with the Poincaré group to form a
super-Poincaré algebra, supergravity follows naturally from supersymmetry.[1]
Gravitons
Like any field theory of gravity, a supergravity theory contains a spin-2 field whose quantum is the graviton.
Supersymmetry requires the graviton field to have a superpartner. This field has spin 3/2 and its quantum is the
gravitino. The number of gravitino fields is equal to the number of supersymmetries. Supergravity theories are often
said to be the only consistent theories of interacting massless spin 3/2 fields.
History
Four-dimensional SUGRA
SUGRA, or SUper GRAvity, was initially proposed as a four-dimensional theory in 1976 by Daniel Z. Freedman,
Peter van Nieuwenhuizen and Sergio Ferrara at Stony Brook University, but was quickly generalized to many
different theories in various numbers of dimensions and with a greater number (N) of supersymmetry charges.
Supergravity theories with N > 1 are usually referred to as extended supergravity (SUEGRA). Some supergravity
theories were shown to be equivalent to certain higher-dimensional supergravity theories via dimensional reduction
(e.g. N = 1, 11-dimensional supergravity is dimensionally reduced on $S^7$ to N = 8, d = 4 SUGRA). The resulting
theories were sometimes referred to as Kaluza-Klein theories, as Kaluza and Klein constructed, nearly a century ago,
a five-dimensional gravitational theory whose four-dimensional massless modes, after dimensional reduction on a
circle, describe electromagnetism coupled to gravity.
mSUGRA
mSUGRA means minimal SUper GRAvity. The construction of a realistic model of particle interactions within the N
= 1 supergravity framework where supersymmetry (SUSY) is broken by a super Higgs mechanism was carried out
by Ali Chamseddine, Richard Arnowitt and Pran Nath in 1982. In these classes of models collectively now known as
minimal supergravity Grand Unification Theories (mSUGRA GUT), gravity mediates the breaking of SUSY through
the existence of a hidden sector. mSUGRA naturally generates the Soft SUSY breaking terms which are a
consequence of the Super Higgs effect. Radiative breaking of electroweak symmetry through Renormalization
Group Equations (RGEs) follows as an immediate consequence. mSUGRA is one of the most widely investigated
models of particle physics due to its predictive power, requiring only four input parameters and a sign to determine
the low energy phenomenology from the scale of Grand Unification.
11d: the maximal SUGRA
One of these supergravities, the 11-dimensional theory, generated considerable excitement as the first potential
candidate for the theory of everything. This excitement was built on four pillars, two of which have now been largely
discredited:
• Werner Nahm showed that 11 dimensions was the largest number of dimensions consistent with a single graviton,
and that a theory with more dimensions would also have particles with spins greater than 2. These problems are
avoided in 12 dimensions if two of these dimensions are timelike, as has been often emphasized by Itzhak Bars.
• In 1981, Ed Witten showed that 11 was the smallest number of dimensions that was big enough to contain the
gauge groups of the Standard Model, namely SU(3) for the strong interactions and SU(2) times U(1) for the
electroweak interactions. Today many techniques exist to embed the standard model gauge group in supergravity
in any number of dimensions. For example, in the mid and late 1980s one often used the obligatory gauge
symmetry in type I and heterotic string theories. In type II string theory they could also be obtained by
compactifying on certain Calabi-Yau manifolds. Today one may also use D-branes to engineer gauge symmetries.
• In 1978, Eugene Cremmer, Bernard Julia and Joel Scherk (CJS) of the École Normale Supérieure found the
classical action for an 11-dimensional supergravity theory. This remains today the only known classical
11-dimensional theory with local supersymmetry and no fields of spin higher than two. Other 11-dimensional
theories are known that are quantum-mechanically inequivalent to the CJS theory, but classically equivalent (that
is, they reduce to the CJS theory when one imposes the classical equations of motion). For example, in the mid
1980s Bernard de Wit and Hermann Nicolai found an alternate theory in D=11 Supergravity with Local SU(8)
Invariance. This theory, while not manifestly Lorentz-invariant, is in many ways superior to the CJS theory in
that, for example, it dimensionally reduces to the 4-dimensional theory without recourse to the classical equations
of motion.
• In 1980, Peter G. O. Freund and M. A. Rubin showed that compactification from 11 dimensions preserving all the
SUSY generators could occur in two ways, leaving only 4 or 7 macroscopic dimensions (the other 7 or 4 being
compact). Unfortunately, the noncompact dimensions have to form an anti de Sitter space. Today it is understood
that there are many possible compactifications, but that the Freund-Rubin compactifications are invariant under
all of the supersymmetry transformations that preserve the action.
Thus, the first two results appeared to establish 11 dimensions uniquely, the third result appeared to specify the
theory, and the last result explained why the observed universe appears to be four-dimensional.
Many of the details of the theory were fleshed out by Peter van Nieuwenhuizen, Sergio Ferrara and Daniel Z.
Freedman.
The end of the SUGRA era
The initial excitement over 11-dimensional supergravity soon waned, as various failings were discovered, and
attempts to repair the model failed as well. Problems included:
• The compact manifolds which were known at the time and which contained the standard model were not
compatible with supersymmetry, and could not hold quarks or leptons. One suggestion was to replace the
compact dimensions with the 7-sphere, with the symmetry group SO(8), or the squashed 7-sphere, with symmetry
group SO(5) times SU(2).
• Until recently, the physical neutrinos seen in the real world were believed to be massless, and appeared to be
left-handed, a phenomenon referred to as the chirality of the Standard Model. It was very difficult to construct a
chiral fermion from a compactification — the compactified manifold needed to have singularities, but physics
near singularities did not begin to be understood until the advent of orbifold conformal field theories in the late
1980s.
• Supergravity models generically result in an unrealistically large cosmological constant in four dimensions. That
constant is difficult to remove, and so requires fine-tuning. This is still a problem today.
• Quantization of the theory led to quantum field theory gauge anomalies rendering the theory inconsistent. In the
intervening years physicists have learned how to cancel these anomalies.
Some of these difficulties could be avoided by moving to a 10-dimensional theory involving superstrings. However,
by moving to 10 dimensions one loses the sense of uniqueness of the 11-dimensional theory.
The core breakthrough for the 10-dimensional theory, known as the first superstring revolution, was a demonstration
by Michael B. Green, John H. Schwarz and David Gross that there are only three supergravity models in 10
dimensions which have gauge symmetries and in which all of the gauge and gravitational anomalies cancel. These
were theories built on the groups SO(32) and $E_8 \times E_8$, the direct product of two copies of $E_8$. Today we know
that, using D-branes for example, gauge symmetries can be introduced in other 10-dimensional theories as well.
The second superstring revolution
Initial excitement about the 10d theories, and the string theories that provide their quantum completion, died by the
end of the 1980s. There were too many Calabi-Yaus to compactify on, many more than Yau had estimated, as he
admitted in December 2005 at the 23rd International Solvay Conference in Physics. None quite gave the standard
model, but it seemed as though one could get close with enough effort in many distinct ways. In addition, no one
understood the theory beyond the regime of applicability of string perturbation theory.
There was a comparatively quiet period at the beginning of the 1990s, during which, however, several important
tools were developed. For example, it became apparent that the various superstring theories were related by "string
dualities", some of which relate weak string-coupling (i.e. perturbative) physics in one model with strong
string-coupling (i.e. non-perturbative) in another.
Then it all changed, in what is known as the second superstring revolution. Joseph Polchinski realized that obscure
string theory objects, called D-branes, which he had discovered six years earlier, are stringy versions of the p-branes
that were known in supergravity theories. The treatment of these p-branes was not restricted by string perturbation
theory; in fact, thanks to supersymmetry, p-branes in supergravity were understood well beyond the limits in which
string theory was understood.
Armed with this new nonperturbative tool, Edward Witten and many others were able to show that all of the
perturbative string theories were descriptions of different states in a single theory, which Witten named M-theory.
Furthermore, he argued that the long wavelength limit of M-theory (that is, the regime in which the quantum
wavelengths associated with objects in the theory are much larger than the size of the 11th dimension) should be
described by the 11-dimensional supergravity that had fallen out of favor with the first superstring revolution 10
years earlier, accompanied by the 2- and 5-branes.
Historically, then, supergravity has come "full circle". It is a commonly used framework in understanding features of
string theories, M-theory and their compactifications to lower spacetime dimensions.
Relation to superstrings
Particular 10-dimensional supergravity theories are considered "low energy limits" of the 10-dimensional superstring
theories; more precisely, these arise as the massless, tree-level approximation of string theories. True effective field
theories of string theories, rather than truncations, are rarely available. Due to string dualities, the conjectured
11-dimensional M-theory is required to have 11-dimensional supergravity as a "low energy limit". However, this
doesn't necessarily mean that string theory/M-theory is the only possible UV completion of supergravity;
supergravity research is useful independent of those relations.
4D N = 1 SUGRA
Before we move on to SUGRA proper, let's recapitulate some important details about general relativity. We have a
4D differentiable manifold M with a Spin(3,1) principal bundle over it. This principal bundle represents the local
Lorentz symmetry. In addition, we have a vector bundle T over the manifold with the fiber having four real
dimensions and transforming as a vector under Spin(3,1). We have an invertible linear map from the tangent bundle
TM to T. This map is the vierbein. The local Lorentz symmetry has a gauge connection associated with it, the spin
connection.
The following discussion will be in superspace notation, as opposed to the component notation, which isn't
manifestly covariant under SUSY. There are actually many different versions of SUGRA out there which are
inequivalent in the sense that their actions and constraints upon the torsion tensor are different, but ultimately
equivalent in that we can always perform a field redefinition of the supervierbeins and spin connection to get from
one version to another.
In 4D N = 1 SUGRA, we have a 4|4 real differentiable supermanifold M, i.e. we have 4 real bosonic dimensions and 4
real fermionic dimensions. As in the nonsupersymmetric case, we have a Spin(3,1) principal bundle over M. We
have an $\mathbb{R}^{4|4}$ vector bundle T over M. The fiber of T transforms under the local Lorentz group as follows: the four
real bosonic dimensions transform as a vector and the four real fermionic dimensions transform as a Majorana
spinor. This Majorana spinor can be reexpressed as a complex left-handed Weyl spinor and its complex conjugate
right-handed Weyl spinor (they're not independent of each other). We also have a spin connection as before.
We will use the following conventions: the spatial (both bosonic and fermionic) indices will be indicated by M, N, ... .
The bosonic spatial indices will be indicated by $\mu, \nu, \ldots$, the left-handed Weyl spatial indices by $\alpha, \beta, \ldots$, and the
right-handed Weyl spatial indices by $\dot{\alpha}, \dot{\beta}, \ldots$. The indices for the fiber of T will follow a similar notation, except
that they will be hatted like this: $\hat{M}, \hat{\alpha}$. See van der Waerden notation for more details. $M = (\mu, \alpha, \dot{\alpha})$. The
supervierbein is denoted by $e^{\hat{M}}_N$, and the spin connection by $\omega_{\hat{M}\hat{N}P}$. The inverse supervierbein is denoted by
$E^N_{\hat{M}}$.
The supervierbein and spin connection are real in the sense that they satisfy the reality conditions
$e^{\hat{M}}_N(x, \bar{\theta}, \theta)^* = e^{\hat{M}^*}_{N^*}(x, \theta, \bar{\theta})$, where $\mu^* = \mu$, $\alpha^* = \dot{\alpha}$, and $\dot{\alpha}^* = \alpha$, and $\omega(x, \bar{\theta}, \theta)^* = \omega(x, \theta, \bar{\theta})$.
The covariant derivative is defined as
$D_{\hat{M}} f = E^N_{\hat{M}} \left( \partial_N f + \omega_N[f] \right)$.
The covariant exterior derivative as defined over supermanifolds needs to be super graded. This means that every
time we interchange two fermionic indices, we pick up a +1 sign factor, instead of -1.
The presence or absence of R symmetries is optional, but if R-symmetry exists, the integrand over the full
superspace has to have an R-charge of 0 and the integrand over chiral superspace has to have an R-charge of 2.
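A chiral superfield X is a superfield which satisfies $\bar{D}_{\hat{\dot{\alpha}}} X = 0$. In order for this constraint to be consistent, we
require the integrability conditions that $\{\bar{D}_{\hat{\dot{\alpha}}}, \bar{D}_{\hat{\dot{\beta}}}\} = c^{\hat{\dot{\gamma}}}_{\hat{\dot{\alpha}}\hat{\dot{\beta}}} \bar{D}_{\hat{\dot{\gamma}}}$ for some coefficients c.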
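Unlike nonSUSY GR, the torsion has to be nonzero, at least with respect to the fermionic directions. Already, even
in flat superspace, $D_{\hat{\alpha}} e_{\hat{\dot{\alpha}}} + \bar{D}_{\hat{\dot{\alpha}}} e_{\hat{\alpha}} \neq 0$. In one version of SUGRA (but certainly not the only one), we have the
following constraints upon the torsion tensor:
$T^{\hat{\underline{\gamma}}}_{\hat{\underline{\alpha}}\hat{\underline{\beta}}} = 0$
$T^{\hat{\mu}}_{\hat{\alpha}\hat{\beta}} = 0$
$T^{\hat{\mu}}_{\hat{\dot{\alpha}}\hat{\dot{\beta}}} = 0$
$T^{\hat{\nu}}_{\hat{\mu}\hat{\underline{\alpha}}} = 0$
$T^{\hat{\rho}}_{\hat{\mu}\hat{\nu}} = 0$
Here, $\underline{\alpha}$ is a shorthand notation to mean the index runs over either the left or right Weyl spinors.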
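The superdeterminant of the supervierbein, $|e|$, gives us the volume factor for M. Equivalently, we have the
volume 4|4-superform $e^{\hat{\mu}=0} \wedge \cdots \wedge e^{\hat{\mu}=3} \wedge e^{\hat{\alpha}=1} \wedge e^{\hat{\alpha}=2} \wedge e^{\hat{\dot{\alpha}}=1} \wedge e^{\hat{\dot{\alpha}}=2}$.
If we complexify the superdiffeomorphisms, there is a gauge where $E^{\mu}_{\hat{\dot{\alpha}}} = 0$, $E^{\beta}_{\hat{\dot{\alpha}}} = 0$ and $E^{\dot{\beta}}_{\hat{\dot{\alpha}}} = \delta^{\dot{\beta}}_{\dot{\alpha}}$. The
resulting chiral superspace has the coordinates x and $\Theta$.
R is a scalar valued chiral superfield derivable from the supervielbeins and spin connection. If f is any superfield,
$(\bar{D}^2 - 8R) f$ is always a chiral superfield.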
The action for a SUGRA theory with chiral superfields X is given by
$S = \int d^4x \, d^2\Theta \, 2\mathcal{E} \left[ \frac{3}{8} \left( \bar{D}^2 - 8R \right) e^{-K(\bar{X}, X)/3} + W(X) \right] + \text{c.c.}$
where K is the Kähler potential, W is the superpotential, and $\mathcal{E}$ is the chiral volume factor. Unlike the case for
flat superspace, adding a constant to either the Kähler potential or the superpotential is now physical. A constant shift to the
Kähler potential changes the effective Planck mass, while a constant shift to the superpotential changes the
effective cosmological constant. As the effective Planck mass now depends upon the value of the chiral
superfield X, we need to rescale the supervierbeins (a field redefinition) to get a constant Planck mass. This is
called the Einstein frame.
Higher-dimensional SUGRA
See the article higher-dimensional supergravity for more details.
References
Historical
[1] P. van Nieuwenhuizen, Phys. Rep. 68, 189 (1981)
[2] http://ccdb4fs.kek.jp/cgi-bin/img_index78604009
• D.Z. Freedman, P. van Nieuwenhuizen and S. Ferrara, "Progress Toward A Theory Of Supergravity", Physical
Review D13 (1976) pp 3214-3218.
• E. Cremmer, B. Julia and J. Scherk, "Supergravity theory in eleven dimensions", Physics Letters B76 (1978) pp
409-412. scanned version (http://www-lib.kek.jp/cgi-bin/img_index77805106)
• P. Freund and M. Rubin, "Dynamics of dimensional reduction", Physics Letters B97 (1980) pp 233-235.
• Ali H. Chamseddine, R. Arnowitt, Pran Nath, "Locally Supersymmetric Grand Unification", Phys. Rev. Lett. 49
(1982) p. 970.
• Michael B. Green, John H. Schwarz, "Anomaly Cancellation in Supersymmetric D=10 Gauge Theory and
Superstring Theory", Physics Letters B149 (1984) pp 117-122.
General
• Bernard de Wit (2002) Supergravity (http://arxiv.org/abs/hep-th/0212245v1)
• A Supersymmetry primer (http://arxiv.org/abs/hep-ph/9709356) (1998) updated in (2006), (the user friendly
guide).
• Adel Bilal, Introduction to supersymmetry (http://arxiv.org/hep-th/0101055) (2001) ArXiv hep-th/0101055, (a
comprehensive introduction to supersymmetry).
• Friedemann Brandt, Lectures on supergravity (http://arxiv.org/abs/hep-th/0204035) (2002) ArXiv
hep-th/0204035, (an introduction to 4-dimensional N = 1 supergravity).
• Wess, Julius; Bagger, Jonathan (1992). Supersymmetry and Supergravity. Princeton University Press, pp. 260.
ISBN 0691025304.
Quantum statistical mechanics
Quantum statistical mechanics is the study of statistical ensembles of quantum mechanical systems. A statistical
ensemble is described by a density operator S, which is a non-negative, self-adjoint, trace-class operator of trace 1 on
the Hilbert space H describing the quantum system. This can be shown under various mathematical formalisms for
quantum mechanics. One such formalism is provided by quantum logic.
Expectation
From classical probability theory, we know that the expectation of a random variable X is completely determined by
its distribution by
$\operatorname{Exp}(X) = \int_{\mathbb{R}} \lambda \, d\operatorname{D}_X(\lambda)$,
assuming, of course, that the random variable is integrable or that the random variable is non-negative. Similarly, let
A be an observable of a quantum mechanical system. A is given by a densely defined self-adjoint operator on H. The
spectral measure of A defined by
$\operatorname{E}_A(U) = \int_U \lambda \, d\operatorname{E}(\lambda)$
uniquely determines A and conversely, is uniquely determined by A. $\operatorname{E}_A$ is a boolean homomorphism from the Borel
subsets of R into the lattice Q of self-adjoint projections of H. In analogy with probability theory, given a state S, we
introduce the distribution of A under S which is the probability measure defined on the Borel subsets of R by
$\operatorname{D}_A(U) = \operatorname{Tr}(\operatorname{E}_A(U) S)$.
Similarly, the expected value of A is defined in terms of the probability distribution $\operatorname{D}_A$ by
$\operatorname{Exp}(A) = \int_{\mathbb{R}} \lambda \, d\operatorname{D}_A(\lambda)$.
Note that this expectation is relative to the mixed state S which is used in the definition of $\operatorname{D}_A$.
Remark. For technical reasons, one needs to consider separately the positive and negative parts of A defined by the
Borel functional calculus for unbounded operators.
One can easily show:
$\operatorname{Exp}(A) = \operatorname{Tr}(AS) = \operatorname{Tr}(SA)$.
Note that if S is a pure state corresponding to the vector $\psi$,
$\operatorname{Exp}(A) = \langle \psi | A | \psi \rangle$.
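The trace formula can be checked numerically in a finite-dimensional toy case. The following is a minimal sketch, assuming a two-level system with the Pauli X matrix as observable; the helper names (matmul, trace, pure_state, expectation) and the chosen vector are illustrative and not part of the text above.

```python
import math

def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    """Sum of the diagonal entries."""
    return sum(a[i][i] for i in range(len(a)))

def pure_state(psi):
    """Density operator |psi><psi| of a normalized vector psi."""
    return [[x * y.conjugate() for y in psi] for x in psi]

def expectation(a, s):
    """Exp(A) = Tr(A S) for an observable A and a state S."""
    return trace(matmul(a, s))

# Assumed example: Pauli X as the observable on a two-level system.
A = [[0, 1], [1, 0]]
psi = [complex(math.cos(0.3)), complex(math.sin(0.3))]  # unit vector

S_pure = pure_state(psi)
exp_pure = expectation(A, S_pure)

# Direct evaluation of <psi|A|psi> for comparison.
bracket = sum(psi[i].conjugate() * A[i][j] * psi[j]
              for i in range(2) for j in range(2))

# A mixed state: equal weights on the two basis vectors.
S_mixed = [[0.5, 0.0], [0.0, 0.5]]
exp_mixed = expectation(A, S_mixed)

print(exp_pure, bracket, exp_mixed)
```

For the pure state both routes give the same number, while the equal-weight mixed state gives a vanishing expectation for this particular observable.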
Von Neumann entropy
Of particular significance for describing randomness of a state is the von Neumann entropy of S, formally defined by
$\operatorname{H}(S) = -\operatorname{Tr}(S \log_2 S)$.
Actually, the operator $S \log_2 S$ is not necessarily trace-class. However, if S is a non-negative self-adjoint operator not
of trace class we define Tr(S) = +∞. Also note that any density operator S can be diagonalized, that is, it can be
represented in some orthonormal basis by a (possibly infinite) matrix of the form
$\begin{bmatrix} \lambda_1 & 0 & \cdots & 0 & \cdots \\ 0 & \lambda_2 & \cdots & 0 & \cdots \\ \vdots & \vdots & \ddots & \vdots & \\ 0 & 0 & \cdots & \lambda_n & \cdots \\ \vdots & \vdots & & \vdots & \ddots \end{bmatrix}$
and we define
$\operatorname{H}(S) = -\sum_i \lambda_i \log_2 \lambda_i$.
The convention is that $0 \log_2 0 = 0$, since an event with probability zero should not contribute to the entropy. This
value is an extended real number (that is, in [0, ∞]) and this is clearly a unitary invariant of S.
Remark. It is indeed possible that H(S) = +∞ for some density operator S. In fact, let T be the diagonal matrix
$T = \operatorname{diag}\left( \frac{1}{2 (\log_2 2)^2}, \frac{1}{3 (\log_2 3)^2}, \ldots, \frac{1}{n (\log_2 n)^2}, \ldots \right)$.
T is non-negative trace class and one can show $T \log_2 T$ is not trace-class.
Theorem. Entropy is a unitary invariant.
In analogy with classical entropy (notice the similarity in the definitions), H(S) measures the amount of randomness
in the state S. The more dispersed the eigenvalues are, the larger the system entropy. For a system in which the space
H is finite-dimensional, entropy is maximized for the states S which in diagonal form have the representation
$\begin{bmatrix} \frac{1}{n} & 0 & \cdots & 0 \\ 0 & \frac{1}{n} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \frac{1}{n} \end{bmatrix}$.
For such an S, H(S) = $\log_2 n$. The state S is called the maximally mixed state.
Recall that a pure state is one of the form
$S = |\psi\rangle \langle\psi|$,
for $\psi$ a vector of norm 1.
Theorem. H(S) = 0 if and only if S is a pure state.
For S is a pure state if and only if its diagonal form has exactly one non-zero entry which is a 1.
Entropy can be used as a measure of quantum entanglement.
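The two extreme cases above can be illustrated with a short computation. This is a minimal sketch, assuming the density operator is already given in diagonal form, so that only its eigenvalues are needed; the function name von_neumann_entropy and the sample spectra are illustrative choices.

```python
import math

def von_neumann_entropy(eigenvalues):
    """H(S) = -sum_i lambda_i log2(lambda_i), with the convention 0 log2 0 = 0.

    `eigenvalues` is the spectrum of a density operator in diagonal form:
    non-negative numbers that sum to 1.
    """
    total = sum(eigenvalues)
    if any(lam < 0 for lam in eigenvalues) or abs(total - 1.0) > 1e-9:
        raise ValueError("eigenvalues must be non-negative and sum to 1")
    # Zero eigenvalues are skipped, which implements 0 log2 0 = 0.
    return -sum(lam * math.log2(lam) for lam in eigenvalues if lam > 0)

n = 4
h_pure = von_neumann_entropy([1.0, 0.0, 0.0, 0.0])   # pure state
h_max = von_neumann_entropy([1.0 / n] * n)           # maximally mixed state
h_mid = von_neumann_entropy([0.7, 0.1, 0.1, 0.1])    # something in between

print(h_pure, h_mid, h_max, math.log2(n))
```

The pure state has zero entropy, the maximally mixed state reaches $\log_2 n$, and a less evenly spread spectrum falls strictly between the two.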
Gibbs canonical ensemble
Consider an ensemble of systems described by a Hamiltonian H with average energy E. If H has pure-point spectrum
and the eigenvalues $E_n$ of H go to +∞ sufficiently fast, $e^{-rH}$ will be a non-negative trace-class operator for every
positive r.
The Gibbs canonical ensemble is described by the state
$S = \frac{e^{-\beta H}}{\operatorname{Tr}(e^{-\beta H})}$,
where $\beta$ is such that the ensemble average of energy satisfies
$\operatorname{Tr}(SH) = E$,
and
$\operatorname{Tr}(e^{-\beta H}) = \sum_n e^{-\beta E_n}$
is the quantum mechanical version of the canonical partition function. The probability that a system chosen at
random from the ensemble will be in a state corresponding to energy eigenvalue $E_m$ is
$P(E_m) = \frac{e^{-\beta E_m}}{\sum_n e^{-\beta E_n}}$.
Under certain conditions, the Gibbs canonical ensemble maximizes the von Neumann entropy of the state subject to
the energy conservation requirement.
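For a finite spectrum, the Gibbs state and the condition Tr(SH) = E can be worked out directly. The following is a minimal sketch, assuming a hypothetical four-level system with energies 0, 1, 2, 3 in arbitrary units and a target average energy of 1; the bisection solver and all function names are illustrative assumptions, not a method taken from the text.

```python
import math

def partition_function(energies, beta):
    """Tr(exp(-beta H)) = sum_n exp(-beta E_n) for a finite spectrum."""
    return sum(math.exp(-beta * e) for e in energies)

def gibbs_probabilities(energies, beta):
    """Probability of each energy eigenvalue in the canonical ensemble."""
    z = partition_function(energies, beta)
    return [math.exp(-beta * e) / z for e in energies]

def average_energy(energies, beta):
    """Tr(S H) for the Gibbs state at inverse temperature beta."""
    probs = gibbs_probabilities(energies, beta)
    return sum(p * e for p, e in zip(probs, energies))

def solve_beta(energies, target, lo=1e-9, hi=50.0, tol=1e-12):
    """Find beta with Tr(S H) = target by bisection.

    Relies on the average energy decreasing monotonically in beta, and
    assumes the target lies between the values at `hi` and `lo`.
    """
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if average_energy(energies, mid) > target:
            lo = mid   # still too hot: increase beta
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

energies = [0.0, 1.0, 2.0, 3.0]   # hypothetical spectrum
beta = solve_beta(energies, target=1.0)
probs = gibbs_probabilities(energies, beta)

print(beta, probs, average_energy(energies, beta))
```

Since the target energy lies below the infinite-temperature average of 1.5, the resulting β is positive and lower energy levels are more probable.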
References
• J. von Neumann, Mathematical Foundations of Quantum Mechanics, Princeton University Press, 1955.
• F. Reif, Statistical and Thermal Physics, McGraw-Hill, 1965.
Quantum thermodynamics
In the physical sciences, quantum thermodynamics is the study of heat and work dynamics in quantum systems.
Approximately, quantum thermodynamics attempts to combine thermodynamics and quantum mechanics into a
coherent whole. The essential point at which "quantum mechanics" began was when, in 1900, Max Planck outlined
the "quantum hypothesis", i.e. that the energy of atomic systems can be quantized, as based on the first two laws of
thermodynamics as described by Rudolf Clausius (1865) and Ludwig Boltzmann (1877).[1][2] See the history of
quantum mechanics for a more detailed outline.
Overview
A central objective in quantum thermodynamics is the quantitative and qualitative determination of the laws of
thermodynamics at the quantum level in which uncertainty and probability begin to take effect. A fundamental
question is: what remains of thermodynamics if one goes to the extreme limit of small quantum systems having a
few degrees of freedom? If thermodynamics applies at this level, does the second law of thermodynamics remain
unchanged, or is there a more universal formulation than the many existing formulations, such as: the entropy of a
closed system cannot decrease; heat flows from high to low temperature; systems evolve towards minimum potential
energy wells; energy tends to dissipate; and so on.
References
[1] Planck, Max. (1900). "Entropy and Temperature of Radiant Heat (http://www.iee.org/publish/inspec/prodcat/1900A01446.xml)".
Annalen der Physik, vol. 1, no. 4, April, pp. 719-37.
[2] Planck, Max. (1901). "On the Law of Distribution of Energy in the Normal Spectrum (http://dbhs.wvusd.k12.ca.us/webdocs/
Chem-History/Planck-1901/Planck-1901.html)". Annalen der Physik, vol. 4, p. 553 ff.
Further reading
1. Gemmer, J., Michel, M., Mahler, G. (2005). Quantum Thermodynamics — Emergence of Thermodynamic
Behavior Within Composite Quantum Systems. Springer. ISBN 3-540-22911-6.
2. Rudakov, E.S. (1998). Molecular, Quantum and Evolution Thermodynamics: Development and Specialization of
the Gibbs Method. Donetsk State University Press. ISBN 966-02-0708-5.
External links
• Quantum Thermodynamics and the Gibbs Paradox (http://staff.science.uva.nl/~nieuwenh/QL2L.html)
• Quantum Thermodynamics (http://www.chaos.org.uk/~eddy/physics/heat.html)
• On the Classical Limit of Quantum Thermodynamics in Finite Time (http://www.fh.huji.ac.il/~ronnie/Papers/geva92.pdf)
[PDF-format]
• Quantum Thermodynamics (http://www.quantumthermodynamics.org) - list of good related articles
Supertheory
The theory of everything (TOE) is a putative theory of theoretical physics that fully explains and links together all
known physical phenomena, and, ideally, has predictive power for the outcome of any experiment that could be
carried out in principle.
Initially, the term was used with an ironic connotation to refer to various overgeneralized theories. For example, a
great-grandfather of Ijon Tichy — a character from a cycle of Stanislaw Lem's science fiction stories of the
1960s — was known to work on the "General Theory of Everything". Physicist John Ellis claims to have introduced
the term into the technical literature in an article in Nature in 1986. Over time, the term stuck in popularizations of
quantum physics to describe a theory that would unify or explain through a single model the theories of all
fundamental interactions and of all particles of nature: general relativity for gravitation, and the standard model of
elementary particle physics - which includes quantum mechanics - for electromagnetism, the two nuclear
interactions, and the known elementary particles.
There have been many theories of everything proposed by theoretical physicists over the twentieth century, but none
have been confirmed experimentally. The primary problem in producing a TOE is that the accepted theories of
quantum mechanics and general relativity are hard to combine. Their mutual incompatibility makes their unification
a difficult task. The combination is one of the unsolved problems in physics.
Based on theoretical holographic principle arguments from the 1990s, many physicists believe that 11-dimensional
M-theory, which is described in many sectors by matrix string theory and in many other sectors by perturbative string
theory, is the complete theory of everything. However, there is no widespread consensus on this issue, because
M-theory is not a completed theory but rather an approach for producing one.
Historical antecedents
Ancient Greece to Einstein
Archimedes was possibly the first scientist to describe nature with axioms (or principles) and then to deduce new
results from them. The putative theory of everything is expected to achieve the deduction of all phenomena from
basic axioms.
Since ancient Greek times, philosophers have speculated that the apparent diversity of appearances conceals an
underlying unity, and thus that the list of forces might be short, indeed might contain only a single entry. For
example, the mechanical philosophy of the 17th century posited that all forces could be ultimately reduced to contact
forces between tiny solid particles. This was abandoned after the acceptance of Isaac Newton's long-distance force
of gravity; but at the same time, Newton's work in his Principia provided the first dramatic empirical evidence for
the unification of apparently distinct forces: Galileo's work on terrestrial gravity, Kepler's laws of planetary motion,
and the phenomenon of tides were all quantitatively explained by a single law of universal gravitation.
Building on these results, Laplace famously suggested that a sufficiently powerful intellect could, if it knew the
position and velocity of every particle at a given time, along with the laws of nature, calculate the position of any
particle at any other time:
An intellect which at a certain moment would know all forces that set nature in motion, and all positions of all
items of which nature is composed, if this intellect were also vast enough to submit these data to analysis, it
would embrace in a single formula the movements of the greatest bodies of the universe and those of the
tiniest atom; for such an intellect nothing would be uncertain and the future just like the past would be present
before its eyes.
— Essai philosophique sur les probabilites, Introduction. 1814
Modern quantum mechanics implies that uncertainty is inescapable, and thus that Laplace's vision cannot be correct.
A theory of everything thus must include quantum mechanics and gravitation.
In 1820, Hans Christian Ørsted discovered a connection between electricity and magnetism, triggering decades of
work that culminated in James Clerk Maxwell's theory of electromagnetism. Also during the 19th and early 20th
centuries, it gradually became apparent that many common examples of forces — contact forces, elasticity, viscosity,
friction, pressure — resulted from electrical interactions between the smallest particles of matter.
In the late 1920s, the new quantum mechanics showed that the chemical bonds between atoms were examples of
(quantum) electrical forces, justifying Dirac's boast that "the underlying physical laws necessary for the
mathematical theory of a large part of physics and the whole of chemistry are thus completely known".
Attempts to unify gravity with electromagnetism date back at least to Michael Faraday's experiments of 1849-50.
After Albert Einstein's theory of gravity (general relativity) was published in 1915, the search for a unified field
theory combining gravity with electromagnetism began in earnest. At the time, it seemed plausible that no other
fundamental forces exist. Prominent contributors were Gunnar Nordstrom, Hermann Weyl, Arthur Eddington,
Theodor Kaluza, Oskar Klein, and most notably, many attempts by Einstein and his collaborators. In his last years,
Albert Einstein was intensely occupied in finding such a unifying theory. None of these attempts was successful.
The nuclear interactions
In the twentieth century, the search for a unifying theory was interrupted by the discovery of the strong and weak
nuclear forces (or interactions), which differ both from gravity and from electromagnetism. A further hurdle was the
acceptance that quantum mechanics had to be incorporated from the start, rather than emerging as a consequence of a
deterministic unified theory, as Einstein had hoped.
Gravity and electromagnetism could always peacefully coexist as entries in a list of classical forces, but for many
years it seemed that gravity could not even be incorporated into the quantum framework, let alone unified with the
other fundamental forces. For this reason, work on unification, for much of the twentieth century, focused on
understanding the three "quantum" forces: electromagnetism and the weak and strong forces. The first two were
combined in 1967-68 by Sheldon Glashow, Steven Weinberg, and Abdus Salam into the "electroweak" force.
However, while the strong and electroweak forces peacefully coexist in the Standard Model of particle physics, they
remain distinct.
Electroweak unification is a broken symmetry: the electromagnetic and weak forces appear distinct at low energies
because the particles carrying the weak force, the W and Z bosons, have masses of 80.4 GeV/$c^2$ and 91.2 GeV/$c^2$,
whereas the photon, which carries the electromagnetic force, is massless. At higher energies Ws and Zs can be
created easily and the unified nature of the force becomes apparent.
Several Grand Unified Theories (GUTs) have been proposed to unify electromagnetism and the weak and strong
forces. Grand unification is expected to set in at energies of the order of $10^{16}$ GeV, far greater than could be reached
by any possible Earth-based particle accelerator. Although the simplest GUTs have been experimentally ruled out,
the general idea, especially when linked with supersymmetry, remains a favorite candidate in the theoretical physics
community.
Modern physics
The conventional pattern of theories
A Theory of Everything would unify all the fundamental interactions of nature: gravitation, strong interaction, weak
interaction, and electromagnetism. Because the weak interaction can transform elementary particles from one kind
into another, the TOE should also yield a deep understanding of the various different kinds of possible particles. The
usual assumed path of theories is given in the following graph, where each unification step leads one level higher:
Theory of Everything
    Gravitation
    Electronuclear force (GUT)
        Strong interaction, SU(3)
        Electroweak force, SU(2) x U(1)
            Weak interaction, SU(2)
            Electromagnetism, U(1)
                Electricity
                Magnetism
In this graph, electroweak unification occurs at around 100 GeV, grand unification is predicted to occur at $10^{16}$ GeV,
and unification of the GUT force with gravity is expected at the Planck energy, roughly $10^{19}$ GeV.
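The last figure can be checked with a short calculation. The sketch below assumes the standard definition of the Planck energy, $E_P = \sqrt{\hbar c^5 / G}$, which is not stated in the text above, together with CODATA values for the constants; the variable names are illustrative.

```python
import math

# Physical constants in SI units (CODATA values).
HBAR = 1.054571817e-34           # reduced Planck constant, J s
C = 299792458.0                  # speed of light, m/s
G = 6.67430e-11                  # Newtonian constant of gravitation, m^3 kg^-1 s^-2
JOULES_PER_GEV = 1.602176634e-10

# Assumed definition: E_P = sqrt(hbar * c^5 / G), converted from joules to GeV.
planck_energy_gev = math.sqrt(HBAR * C**5 / G) / JOULES_PER_GEV

# Energy scales quoted in the text, in GeV.
electroweak_scale = 1e2
gut_scale = 1e16

# Orders of magnitude separating successive unification steps.
gap_electroweak_to_gut = math.log10(gut_scale / electroweak_scale)
gap_gut_to_planck = math.log10(planck_energy_gev / gut_scale)

print(f"Planck energy: {planck_energy_gev:.3e} GeV")
print(f"electroweak -> GUT: {gap_electroweak_to_gut:.1f} orders of magnitude")
print(f"GUT -> Planck: {gap_gut_to_planck:.1f} orders of magnitude")
```

Under these assumptions the Planck energy comes out slightly above $10^{19}$ GeV, about three orders of magnitude beyond the quoted grand unification scale.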
In addition to the forces listed in the graph, a TOE must also explain the status of at least two candidate forces
suggested by modern cosmology: an inflationary force and dark energy. Furthermore, cosmological experiments also
suggest the existence of dark matter, supposedly composed of fundamental particles outside the scheme of the
standard model. However, the existence of these forces and particles has not been proven yet.
It may seem premature to be searching for a TOE when there is as yet no direct evidence for an electronuclear force,
and while in any case there are many different proposed GUTs for this force. Nevertheless, most physicists believe
that a GUT is possible, mainly due to the past history of convergence towards a single theory. Supersymmetric
GUTs seem plausible not only for their theoretical "beauty", but because they naturally produce large quantities of
dark matter, and the inflationary force may be related to GUT physics (although it does not seem to form an
inevitable part of the theory). Yet GUTs are clearly not the final answer. Both the current standard model and all
proposed GUTs are quantum field theories which require the problematic technique of renormalization to yield
sensible answers. This is usually regarded as a sign that these are only effective field theories, omitting crucial
phenomena relevant only at very high energies. Furthermore, the inconsistency between quantum mechanics and
general relativity implies that one or both of these must be replaced by a theory incorporating quantum gravity.
String theory and M-theory
The mainstream theory of everything at the moment is superstring theory / M-theory. These theories attempt to deal
with the renormalization problem by setting up some lower bound on the length scales possible.
String theories and supergravity (both believed to be limiting cases of the yet-to-be-defined M-theory) suppose that
the universe actually has more dimensions than the easily observed three of space and one of time. The motivation
behind this approach began with the Kaluza-Klein theory in which it was noted that applying general relativity to a
five dimensional universe (with the usual four dimensions plus one small curled-up dimension) yields the equivalent
of the usual general relativity in four dimensions together with Maxwell's equations (electromagnetism, also in four
dimensions). This has led to efforts to work with theories with large number of dimensions in the hopes that this
would produce equations that are similar to known laws of physics. The notion of extra dimensions also helps to
resolve the hierarchy problem, which is the question of why gravity is so much weaker than any other force. The
common answer involves gravity leaking into the extra dimensions in ways that the other forces do not.
In the late 1990s, it was noted that one problem with several of the candidates for theories of everything (but
particularly string theory) was that they did not constrain the characteristics of the predicted universe. For example,
many theories of quantum gravity can create universes with arbitrary numbers of dimensions or with arbitrary
cosmological constants. Even the "standard" ten-dimensional string theory allows the "curled up" dimensions to be
compactified in an enormous number of different ways (one estimate is $10^{500}$), each of which corresponds to a
different collection of fundamental particles and low-energy forces. This array of theories is known as the string
theory landscape.
A speculative solution is that many or all of these possibilities are realised in one or another of a huge number of
universes, but that only a small number of them are habitable, and hence the fundamental constants of the universe
are ultimately the result of the anthropic principle rather than a consequence of the theory of everything. This
anthropic approach is often criticised in that, because the theory is flexible enough to encompass almost any
observation, it cannot make useful (as in original, falsifiable, and verifiable) predictions. In this view, string theory
would be considered a pseudoscience, where an unfalsifiable theory is constantly adapted to fit the experimental
results.
Loop quantum gravity
Current research on loop quantum gravity may eventually play a fundamental role in a TOE, but that is not its
primary aim. Loop quantum gravity is facing difficulties in incorporating electromagnetism and the nuclear
interactions.
Other attempts
Any TOE must include general relativity and the standard model of particle physics. Outside the previously
mentioned attempts, the best-known one is Garrett Lisi's E8 proposal.
Present status
At present, no convincing candidate for a TOE is available. Most particle physicists tend to state that the outcome of
the ongoing experiments at the large particle accelerators, the LHC and the Tevatron, is needed in order to provide
theoreticians with precise input for such a theory.
Arguments against a theory of everything
Gödel's incompleteness theorem
A small number of scientists claim that Gödel's incompleteness theorem proves that any attempt to construct a TOE
is bound to fail. Gödel's theorem, informally stated, asserts that any formal theory expressive enough for elementary
arithmetical facts to be expressed and strong enough for them to be proved is either inconsistent (both a statement
and its denial can be derived from its axioms) or incomplete, in the sense that there is a true statement about natural
numbers that can't be derived in the formal theory. In his 1966 book The Relevance of Physics, Stanley Jaki pointed
out that, because any "theory of everything" will certainly be a consistent non-trivial mathematical theory, it must be
incomplete. He claims that this dooms searches for a deterministic theory of everything. In a later reflection, Jaki
states that it is wrong to say that a final theory is impossible, but rather that "when it is on hand one cannot know
rigorously that it is a final theory."
Freeman Dyson has stated that
Gödel's theorem implies that pure mathematics is inexhaustible. No matter how many problems we solve, there will always be other problems
that cannot be solved within the existing rules. [...] Because of Gödel's theorem, physics is inexhaustible too. The laws of physics are a finite
set of rules, and include the rules for doing mathematics, so that Gödel's theorem applies to them.
— NYRB, May 13, 2004
Stephen Hawking was originally a believer in the Theory of Everything but, after considering Gödel's Theorem,
concluded that one was not obtainable.
Some people will be very disappointed if there is not an ultimate theory, that can be formulated as a finite number of principles. I used to
belong to that camp, but I have changed my mind.
— Gödel and the end of physics,[11] July 20, 2002
Jürgen Schmidhuber (1997) has argued against this view; he points out that Gödel's theorems are irrelevant for
computable physics. In 2000, Schmidhuber explicitly constructed limit-computable, deterministic universes
whose pseudo-randomness based on undecidable, Gödel-like halting problems is extremely hard to detect but does
not at all prevent formal TOEs describable by very few bits of information.
Related critique was offered by Solomon Feferman,[15] among others. Douglas S. Robertson offers Conway's game
of life as an example: the underlying rules are simple and complete, but there are formally undecidable questions
about the game's behaviors. Analogously, it may (or may not) be possible to completely state the underlying rules of
physics with a finite number of well-defined laws, but there is little doubt that there are questions about the behavior
of physical systems which are formally undecidable on the basis of those underlying laws.
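To make concrete how small the rule set of the Game of Life is, the following minimal sketch implements one generation on an unbounded grid. It assumes the standard rules (a dead cell with exactly three live neighbours is born; a live cell with two or three live neighbours survives), which the text above does not spell out; the function name life_step and the "blinker" test pattern are illustrative.

```python
from collections import Counter

def life_step(live):
    """Advance Conway's Game of Life by one generation.

    `live` is a set of (x, y) coordinates of live cells on an unbounded grid.
    """
    # Count, for every cell, how many of its eight neighbours are alive.
    neighbour_counts = Counter(
        (x + dx, y + dy)
        for (x, y) in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # Birth on exactly 3 neighbours; survival on 2 or 3.
    return {
        cell
        for cell, count in neighbour_counts.items()
        if count == 3 or (count == 2 and cell in live)
    }

# A "blinker": three cells in a row, which oscillates with period 2.
blinker = {(0, 1), (1, 1), (2, 1)}
after_one = life_step(blinker)
after_two = life_step(after_one)

print(sorted(after_one))
print(after_two == blinker)
```

The complete rule fits in a few lines, yet questions about the long-term behaviour of arbitrary starting patterns are the kind that can be formally undecidable.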
Since most physicists would consider the statement of the underlying rules to suffice as the definition of a "theory of
everything", these researchers argue that Gödel's Theorem does not mean that a TOE cannot exist. On the other
hand, the physicists invoking Gödel's Theorem appear, at least in some cases, to be referring not to the underlying
rules, but to the understandability of the behavior of all physical systems, as when Hawking mentions arranging
blocks into rectangles, turning the computation of prime numbers into a physical question. This definitional
discrepancy may explain some of the disagreement among researchers.
Another approach to working with the limits of logic implied by Gödel's incompleteness theorems is to abandon the
attempt to model reality using a formal system altogether. Process Physics[18] is a notable example of a candidate
TOE that takes this approach, where reality is modeled using self-organizing (purely semantic) information.
Fundamental limits in accuracy
No physical theory to date is believed to be precisely accurate. Instead, physics has proceeded by a series of
"successive approximations" allowing more and more accurate predictions over a wider and wider range of
phenomena. Some physicists believe that it is therefore a mistake to confuse theoretical models with the true nature
of reality, and hold that the series of approximations will never terminate in the "truth". Einstein himself expressed
this view on occasions.[19] On this view, we may reasonably hope for a theory of everything which self-consistently
incorporates all currently known forces, but should not expect it to be the final answer.
On the other hand it is often claimed that, despite the apparently ever-increasing complexity of the mathematics of
each new theory, in a deep sense associated with their underlying gauge symmetry and the number of fundamental
physical constants, the theories are becoming simpler. If so, the process of simplification cannot continue
indefinitely.
Lack of fundamental laws
There is a philosophical debate within the physics community as to whether a theory of everything deserves to be
called the fundamental law of the universe.[20] One view is the hard reductionist position that the TOE is the
fundamental law and that all other theories that apply within the universe are a consequence of the TOE. Another
view is that emergent laws (called "free floating laws" by Steven Weinberg), which govern the behavior of complex
systems, should be seen as equally fundamental. Examples are the second law of thermodynamics and the theory of
natural selection. The point is that, although in our universe these laws describe systems whose behaviour could
("in principle") be predicted from a TOE, they would also hold in universes with different low-level laws, subject
only to some very general conditions. Therefore it is of no help, even in principle, to invoke low-level laws when
discussing the behavior of complex systems. Some argue that this attitude would violate Occam's Razor if a
completely valid TOE were formulated. It is not clear that there is any point at issue in these debates (e.g., between
Steven Weinberg and Philip Anderson) other than the right to apply the high-status word "fundamental" to their
respective subjects of interest.
Impossibility of being "of everything"
Although the name "theory of everything" suggests the determinism of Laplace's quotation, this gives a very
misleading impression. Determinism is frustrated by the probabilistic nature of quantum mechanical predictions, by
the extreme sensitivity to initial conditions that leads to mathematical chaos, and by the extreme mathematical
difficulty of applying the theory. Thus, although the current standard model of particle physics "in principle" predicts
all known non-gravitational phenomena, in practice only a few quantitative results have been derived from the full
theory (e.g., the masses of some of the simplest hadrons), and these results (especially the particle masses which are
most relevant for low-energy physics) are less accurate than existing experimental measurements. The true TOE
would almost certainly be even harder to apply. The main motive for seeking a TOE, apart from the pure intellectual
satisfaction of completing a centuries-long quest, is that all prior successful unifications have predicted new
phenomena, some of which (e.g., electrical generators) have proved of great practical importance. As in other cases
of theory reduction, the TOE would also allow us to confidently define the domain of validity and residual error of
low-energy approximations to the full theory which could be used for practical calculations.
Theory of everything and philosophy
The philosophical implications of a physical TOE are frequently debated. For example, if physicalism is true, a
physical TOE will coincide with a philosophical theory of everything. Some philosophers (Aristotle, Plato, Hegel,
Whitehead, et al.) have attempted to construct all-encompassing systems. Others are highly dubious about the very
possibility of such an exercise.
Stephen Hawking wrote in A Brief History of Time that even if we had a TOE, it would necessarily be a set of
equations. He wrote, "What is it that breathes fire into the equations and makes a universe for them to describe?"[21]
While on his deathbed, Einstein still explored equations that he imagined to be candidates of a unified theory. Of
course, the question would then be "why those equations?" One possible solution might be to adopt the point of view
of ultimate ensemble, or modal realism, and say that those equations are not unique. Others doubt that the theory of
everything will be in the form of equations at all.
References
[1] Ellis, John (2002). "Physics gets physical (correspondence)". Nature 415: 957.
[2] Ellis, John (1986). "The Superstring: Theory of Everything, or of Nothing?". Nature 323: 595-598. doi:10.1038/323595a0.
[3] Shapin, Steven (1996). The Scientific Revolution. University of Chicago Press. ISBN 0226750213.
[4] Dirac, P.A.M. (1929). "Quantum mechanics of many-electron systems". Proceedings of the Royal Society of London A 123: 714.
doi:10.1098/rspa.1929.0094.
[5] Faraday, M. (1850). "Experimental Researches in Electricity. Twenty-Fourth Series. On the Possible Relation of Gravity to Electricity".
Abstracts of the Papers Communicated to the Royal Society of London 5: 994-995. doi:10.1098/rspl.1843.0267.
[6] Pais (1982), Ch. 17.
[7] Weinberg (1993), Ch. 5.
[8] Potter, Franklin (15 February 2005). "Leptons And Quarks In A Discrete Spacetime" (http://www.sciencegems.com/discretespace.pdf).
Frank Potter's Science Gems. Retrieved 2009-12-01.
[9] Jaki, S.L. (1966). The Relevance of Physics. Chicago Press.
[10] Stanley L. Jaki (2004). "A Late Awakening to Gödel in Physics" (http://www.sljaki.com/JakiGodel.pdf), pp. 8-9.
[11] http://www.damtp.cam.ac.uk/strings02/dirac/hawking/
[12] Schmidhuber, Jürgen (1997). A Computer Scientist's View of Life, the Universe, and Everything. Lecture Notes in Computer Science
(http://www.idsia.ch/~juergen/everything/). Springer, pp. 201-208. doi:10.1007/BFb0052071. ISBN 978-3-540-63746-2.
[13] Schmidhuber, Jürgen (2000). "Algorithmic Theories of Everything". arXiv:quant-ph/0011122 [quant-ph].
[14] Schmidhuber, Jürgen (2002). "Hierarchies of generalized Kolmogorov complexities and nonenumerable universal measures computable in
the limit". International Journal of Foundations of Computer Science 13 (4): 587-612. doi:10.1142/S0129054102001291.
[15] Feferman, Solomon (17 November 2006). "The nature and significance of Gödel's incompleteness theorems"
(http://math.stanford.edu/~feferman/papers/Godel-IAS.pdf). Institute for Advanced Study. Retrieved 2009-01-12.
[16] Robertson, Douglas S. (2007). "Goedel's Theorem, the Theory of Everything, and the Future of Science and Mathematics". Complexity 5:
22-27. doi:10.1002/1099-0526(200005/06)5:5<22::AID-CPLX4>3.0.CO;2-0.
[17] Hawking, Stephen (20 July 2002). "Gödel and the end of physics" (http://www.damtp.cam.ac.uk/strings02/dirac/hawking/).
Retrieved 2009-12-01.
[18] Cahill, Reginald (2003). "Process Physics"
(http://www.ctr4process.org/publications/ProcessStudies/PSS/2003-5-CahillR-Process_Physics.shtml). Process Studies Supplement.
Center for Process Studies, pp. 1-131. Retrieved 2009-07-14.
[19] Einstein, letter to Felix Klein, 1917. (On determinism and approximations.) Quoted in Pais (1982), Ch. 17.
[20] Weinberg (1993), Ch. 2.
[21] As quoted in Artigas, The Mind of the Universe, p. 123.
• John D. Barrow, Theories of Everything: The Quest for Ultimate Explanation (OUP, Oxford, 1990) ISBN
0-099-98380-X
• Stephen Hawking 'The Theory of Everything: The Origin and Fate of the Universe' is an unauthorized 2002 book
taken from recorded lectures (ISBN 1-893224-79-1)
• Stanley Jaki OSB, 2005. The Drama of Quantities. Real View Books (ISBN 1-892548-47-X)
• Abraham Pais Subtle is the Lord...: The Science and the Life of Albert Einstein (OUP, Oxford, 1982). ISBN
0-19-853907-X
• John Thompson "Nature's Watchmaker: The Undiscovered Miracle of Time". (Blackhall Publishing Ltd. Ireland,
2009) ISBN 1842181742 (http://natureswatchmaker.com)
• Steven Weinberg Dreams of a Final Theory: The Search for the Fundamental Laws of Nature (Hutchinson
Radius, London, 1993) ISBN 0-09-1773954
External links
• The Elegant Universe-Nova online (http://www.pbs.org/wgbh/nova/elegant/program.html): a 3 hour PBS
show about the search for the Theory of everything and string theory.
• 'Theory of Everything' (http://www.vega.org.uk/video/programme/7) Freeview video by the Vega Science
Trust and the BBC/OU.
Quantum Algebra
Quantum algebra
Quantum algebra is one of the top-level mathematics categories used by the arXiv.
Subjects include:
• Quantum groups
• Skein theories
• Operadic algebra
• Diagrammatic algebra
• Quantum field theory
External links
• Quantum algebra at arxiv.org [1]
References
[1] http://arxiv.org/list/math.QA/current
Lie algebra
In mathematics, a Lie algebra (pronounced /ˈliː/ ("lee"), not /ˈlaɪ/ ("lye")) is an algebraic structure whose main use is
in studying geometric objects such as Lie groups and differentiable manifolds. Lie algebras were introduced to study
the concept of infinitesimal transformations. The term "Lie algebra" (after Sophus Lie) was introduced by Hermann
Weyl in the 1930s. In older texts, the name "infinitesimal group" is used.
Definition and first properties
A Lie algebra is a vector space $\mathfrak{g}$ over some field F together with a binary operation
$[\cdot,\cdot] : \mathfrak{g} \times \mathfrak{g} \to \mathfrak{g}$
called the Lie bracket, which satisfies the following axioms:
• Bilinearity:
$[ax + by, z] = a[x, z] + b[y, z], \quad [z, ax + by] = a[z, x] + b[z, y]$
for all scalars a, b in F and all elements x, y, z in $\mathfrak{g}$.
• Alternating on $\mathfrak{g}$:
$[x, x] = 0$
for all x in $\mathfrak{g}$. This implies anticommutativity, or skew-symmetry (in fact the conditions are equivalent for any
Lie algebra over any field whose characteristic is not 2):
$[x, y] = -[y, x]$
for all elements x, y in $\mathfrak{g}$.
• The Jacobi identity:
$[x, [y, z]] + [y, [z, x]] + [z, [x, y]] = 0$
for all x, y, z in $\mathfrak{g}$.
For any associative algebra A with multiplication * , one can construct a Lie algebra L(A). As a vector space, L(A) is
the same as A. The Lie bracket of two elements of L(A) is defined to be their commutator in A:
[a, b] = a * b — b * a.
The associativity of the multiplication * in A implies the Jacobi identity of the commutator in L(A). In particular, the
associative algebra of n × n matrices over a field F gives rise to the general linear Lie algebra $\mathfrak{gl}_n(F)$. The
associative algebra A is called an enveloping algebra of the Lie algebra L(A). It is known that every Lie algebra can
be embedded into one that arises from an associative algebra in this fashion. See universal enveloping algebra.
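The construction of L(A) can be checked numerically on a small case. The following is a minimal sketch, assuming A is the algebra of 2 × 2 integer matrices; the helper names (mat_mul, bracket, jacobi_sum) and the sample matrices are illustrative choices, not part of the text.

```python
# Sketch: the commutator on an associative matrix algebra satisfies the
# Lie algebra axioms. All helper names here are hypothetical.

def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def bracket(a, b):
    """Commutator [a, b] = a*b - b*a."""
    return mat_sub(mat_mul(a, b), mat_mul(b, a))

def jacobi_sum(x, y, z):
    """Left-hand side of the Jacobi identity: [x,[y,z]] + [y,[z,x]] + [z,[x,y]]."""
    return mat_add(mat_add(bracket(x, bracket(y, z)),
                           bracket(y, bracket(z, x))),
                   bracket(z, bracket(x, y)))

x = [[1, 2], [3, 4]]
y = [[0, 1], [-1, 5]]
z = [[2, 0], [7, -3]]
zero = [[0, 0], [0, 0]]

print(bracket(x, x) == zero)        # alternating property
print(jacobi_sum(x, y, z) == zero)  # Jacobi identity
```

Integer entries keep the arithmetic exact, so the comparisons with the zero matrix are not affected by rounding.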
Homomorphisms, subalgebras, and ideals
The Lie bracket is not an associative operation in general, meaning that [[x, y], z] need not equal [x, [y, z]].
Nonetheless, much of the terminology that was developed in the theory of associative rings or associative algebras is
commonly applied to Lie algebras. A subspace $\mathfrak{h} \subseteq \mathfrak{g}$ that is closed under the Lie bracket is called a Lie
subalgebra. If a subspace $I \subseteq \mathfrak{g}$ satisfies the stronger condition that
$[\mathfrak{g}, I] \subseteq I,$
then I is called an ideal in the Lie algebra $\mathfrak{g}$.[1] A Lie algebra in which the commutator is not identically zero and
which has no proper nonzero ideals is called simple. A homomorphism between two Lie algebras (over the same ground
field) is a linear map that is compatible with the commutators:
$f : \mathfrak{g} \to \mathfrak{g}', \quad f([x, y]) = [f(x), f(y)],$
for all elements x and y in $\mathfrak{g}$. As in the theory of associative rings, ideals are precisely the kernels of
homomorphisms: given a Lie algebra $\mathfrak{g}$ and an ideal I in it, one constructs the factor algebra $\mathfrak{g}/I$, and the first
isomorphism theorem holds for Lie algebras. Given two Lie algebras $\mathfrak{g}$ and $\mathfrak{g}'$, their direct sum is the vector space
$\mathfrak{g} \oplus \mathfrak{g}'$ consisting of the pairs $(x, x')$, $x \in \mathfrak{g}$, $x' \in \mathfrak{g}'$, with the operation
$[(x, x'), (y, y')] = ([x, y], [x', y']), \quad x, y \in \mathfrak{g}, \; x', y' \in \mathfrak{g}'.$
Examples
• Any vector space V endowed with the identically zero Lie bracket becomes a Lie algebra. Such Lie algebras are
called abelian, cf. below. Any one-dimensional Lie algebra over a field is abelian, by the antisymmetry of the Lie
bracket.
• The three-dimensional Euclidean space $\mathbb{R}^3$ with the Lie bracket given by the cross product of vectors becomes a
three-dimensional Lie algebra.
• The Heisenberg algebra is a three-dimensional Lie algebra with generators (see also the definition at Generating
set):
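$x = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad y = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}, \quad z = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix},$
whose commutation relations are
$[x, y] = z, \quad [x, z] = 0, \quad [y, z] = 0.$
It is explicitly exhibited as the space of 3 × 3 strictly upper-triangular matrices.
• The subspace of the general linear Lie algebra $\mathfrak{gl}_n(F)$ consisting of matrices of trace zero is a subalgebra,[2] the
special linear Lie algebra, denoted $\mathfrak{sl}_n(F)$.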
• Any Lie group G defines an associated real Lie algebra $\mathfrak{g} = \mathrm{Lie}(G)$. The definition in general is somewhat
technical, but in the case of real matrix groups, it can be formulated via the exponential map, or the matrix
exponential. The Lie algebra $\mathfrak{g}$ consists of those matrices X for which
$\exp(tX) \in G$
for all real numbers t. The Lie bracket of $\mathfrak{g}$ is given by the commutator of matrices. As a concrete example,
consider the special linear group SL(n,R), consisting of all n × n matrices with real entries and determinant 1.
This is a matrix Lie group, and its Lie algebra consists of all n × n matrices with real entries and trace 0.
• The real vector space of all n × n skew-Hermitian matrices is closed under the commutator and forms a real Lie
algebra denoted $\mathfrak{u}(n)$. This is the Lie algebra of the unitary group U(n).
• An important class of infinite-dimensional real Lie algebras arises in differential topology. The space of smooth
vector fields on a differentiable manifold M forms a Lie algebra, where the Lie bracket is defined to be the
commutator of vector fields. One way of expressing the Lie bracket is through the formalism of Lie derivatives,
which identifies a vector field X with a first order partial differential operator $L_X$ acting on smooth functions by
letting $L_X(f)$ be the directional derivative of the function f in the direction of X. The Lie bracket [X,Y] of two
vector fields is the vector field defined through its action on functions by the formula:
$L_{[X,Y]} f = L_X(L_Y f) - L_Y(L_X f).$
This Lie algebra is related to the pseudogroup of diffeomorphisms of M.
• The commutation relations between the x, y, and z components of the angular momentum operator in quantum
mechanics form a representation of a complex three-dimensional Lie algebra, which is the complexification of the
Lie algebra so(3) of the three-dimensional rotation group:
$[L_x, L_y] = i\hbar L_z$
$[L_y, L_z] = i\hbar L_x$
$[L_z, L_x] = i\hbar L_y$
• Kac-Moody algebra is an example of an infinite-dimensional Lie algebra.
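Two of the matrix examples above lend themselves to a quick check. The sketch below, with hypothetical helper names and arbitrarily chosen sample matrices, takes the Heisenberg generators to be the strictly upper-triangular matrix units x = E12, y = E23, z = E13 and verifies their commutators; it also confirms on one pair that the commutator of two trace-zero matrices again has trace zero.

```python
# Sketch: Heisenberg algebra relations and closure of trace-zero matrices
# under the commutator. Helper names are illustrative only.

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(a, b):
    """Commutator [a, b] = a*b - b*a."""
    ab, ba = mat_mul(a, b), mat_mul(b, a)
    return [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(ab, ba)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

# Heisenberg generators as strictly upper-triangular 3x3 matrices.
x = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]
y = [[0, 0, 0], [0, 0, 1], [0, 0, 0]]
z = [[0, 0, 1], [0, 0, 0], [0, 0, 0]]
zero = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]

print(bracket(x, y) == z)      # [x, y] = z
print(bracket(x, z) == zero)   # z is central
print(bracket(y, z) == zero)

# Two trace-zero matrices (sample elements of the special linear Lie algebra).
a = [[1, 2, 0], [0, -3, 4], [5, 0, 2]]
b = [[0, 1, 1], [2, 0, -1], [3, 3, 0]]
print(trace(a), trace(b), trace(bracket(a, b)))
```

The last line reflects the general fact that tr(ab - ba) = 0 for any square matrices, which is why the trace-zero matrices form a subalgebra.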
Structure theory and classification
Every finite-dimensional real or complex Lie algebra has a faithful representation by matrices (Ado's theorem). Lie's
fundamental theorems describe a relation between Lie groups and Lie algebras. In particular, any Lie group gives
rise to a canonically determined Lie algebra (concretely, the tangent space at the identity), and conversely, for any
Lie algebra there is a corresponding connected Lie group (Lie's third theorem). This Lie group is not determined
uniquely; however, any two connected Lie groups with the same Lie algebra are locally isomorphic, and in
particular, have the same universal cover. For instance, the special orthogonal group SO(3) and the special unitary
group SU(2) give rise to the same Lie algebra, which is isomorphic to $\mathbb{R}^3$ with the cross-product, and SU(2) is a
simply-connected twofold cover of SO(3). Real and complex Lie algebras can be classified to some extent, and this
is often an important step toward the classification of Lie groups.
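The isomorphism between the Lie algebra of SO(3) and three-dimensional space with the cross product can be made concrete. The sketch below assumes the usual "hat map" sending a vector to a skew-symmetric 3 × 3 matrix; that map, and the names hat, cross and bracket, are choices made here for illustration rather than anything fixed by the text.

```python
# Sketch: under the hat map, the matrix commutator of skew-symmetric
# matrices corresponds to the cross product of vectors.

def hat(a):
    """Skew-symmetric matrix with hat(a) applied to v equal to a x v."""
    a1, a2, a3 = a
    return [[0, -a3, a2],
            [a3, 0, -a1],
            [-a2, a1, 0]]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(a, b):
    ab, ba = mat_mul(a, b), mat_mul(b, a)
    return [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(ab, ba)]

a = (1, 2, 3)
b = (-4, 0, 5)

# The commutator of the matrices equals the matrix of the cross product.
print(bracket(hat(a), hat(b)) == hat(cross(a, b)))
```

A single pair of vectors is only a spot check; the identity itself holds for all vectors by bilinearity once it is verified on the standard basis.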
Abelian, nilpotent, and solvable
Analogously to abelian, nilpotent, and solvable groups, defined in terms of the derived subgroups, one can define
abelian, nilpotent, and solvable Lie algebras.
A Lie algebra $\mathfrak{g}$ is abelian if the Lie bracket vanishes, i.e. [x,y] = 0, for all x and y in $\mathfrak{g}$. Abelian Lie algebras
correspond to commutative (or abelian) connected Lie groups such as vector spaces $\mathbb{K}^n$ or tori $\mathbb{T}^n$, and are all of
the form $\mathfrak{k}^n$, meaning an n-dimensional vector space with the trivial Lie bracket.
A more general class of Lie algebras is defined by the vanishing of all commutators of given length. A Lie algebra $\mathfrak{g}$
is nilpotent if the lower central series
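$\mathfrak{g} > [\mathfrak{g}, \mathfrak{g}] > [[\mathfrak{g}, \mathfrak{g}], \mathfrak{g}] > [[[\mathfrak{g}, \mathfrak{g}], \mathfrak{g}], \mathfrak{g}] > \cdots$
becomes zero eventually. By Engel's theorem, a Lie algebra is nilpotent if and only if for every u in $\mathfrak{g}$ the adjoint
endomorphism
$\operatorname{ad}(u) : \mathfrak{g} \to \mathfrak{g}, \quad \operatorname{ad}(u)v = [u, v]$
is nilpotent.
More generally still, a Lie algebra $\mathfrak{g}$ is said to be solvable if the derived series:
$\mathfrak{g} > [\mathfrak{g}, \mathfrak{g}] > [[\mathfrak{g}, \mathfrak{g}], [\mathfrak{g}, \mathfrak{g}]] > [[[\mathfrak{g}, \mathfrak{g}], [\mathfrak{g}, \mathfrak{g}]], [[\mathfrak{g}, \mathfrak{g}], [\mathfrak{g}, \mathfrak{g}]]] > \cdots$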
becomes zero eventually.
Every finite-dimensional Lie algebra has a unique maximal solvable ideal, called its radical. Under the Lie
correspondence, nilpotent (respectively, solvable) connected Lie groups correspond to nilpotent (respectively,
solvable) Lie algebras.
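The adjoint endomorphism in Engel's theorem can be written down explicitly from structure constants. The following sketch uses a hypothetical encoding (a dictionary mapping a pair of basis indices to the coefficient vector of their bracket) and compares two small examples: the three-dimensional Heisenberg algebra with [x, y] = z, and the two-dimensional algebra with [a, b] = b, which is solvable but not nilpotent. It tests sample elements only, so it illustrates the criterion rather than proving nilpotency.

```python
# Sketch: ad(u) as a matrix, built from structure constants.
# The encoding of brackets as a dict is an assumption made for this example.

def ad_matrix(u, table, n):
    """Matrix of ad(u): v -> [u, v]; column j holds the coordinates of [u, e_j]."""
    m = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if (i, j) in table:
                coeffs = table[(i, j)]
            elif (j, i) in table:
                # Anticommutativity: [e_i, e_j] = -[e_j, e_i].
                coeffs = [-c for c in table[(j, i)]]
            else:
                continue
            for k in range(n):
                m[k][j] += u[i] * coeffs[k]
    return m

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def is_nilpotent(m):
    """An n x n matrix is nilpotent exactly when its n-th power is zero."""
    power = m
    for _ in range(len(m) - 1):
        power = mat_mul(power, m)
    return all(v == 0 for row in power for v in row)

# Heisenberg algebra, basis (x, y, z): [x, y] = z.
heisenberg = {(0, 1): [0, 0, 1]}
# Two-dimensional non-abelian algebra, basis (a, b): [a, b] = b.
affine = {(0, 1): [0, 1]}

print(is_nilpotent(ad_matrix([2, -1, 3], heisenberg, 3)))  # True
print(is_nilpotent(ad_matrix([1, 0], affine, 2)))          # False
```

In the second example ad(a) fixes b, so no power of it vanishes, which matches the fact that the lower central series of that algebra stabilises at the span of b instead of reaching zero.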
Simple and semisimple
A Lie algebra is "simple" if it has no non-trivial ideals and is not abelian. A Lie algebra $\mathfrak{g}$ is called semisimple if its
radical is zero. Equivalently, $\mathfrak{g}$ is semisimple if it does not contain any non-zero abelian ideals. In particular, a
simple Lie algebra is semisimple. Conversely, it can be proven that any semisimple Lie algebra is the direct sum of
its minimal ideals, which are canonically determined simple Lie algebras.
The concept of semisimplicity for Lie algebras is closely related to the complete reducibility of their
representations. When the ground field F has characteristic zero, semisimplicity of a Lie algebra $\mathfrak{g}$ over F is
equivalent to the complete reducibility of all finite-dimensional representations of $\mathfrak{g}$. An early proof of this
statement proceeded via connection with compact groups (Weyl's unitary trick), but later entirely algebraic proofs
were found.
Classification
In many ways, the classes of semisimple and solvable Lie algebras are at the opposite ends of the full spectrum of
the Lie algebras. The Levi decomposition expresses an arbitrary Lie algebra as a semidirect sum of its solvable
radical and a semisimple Lie algebra, almost in a canonical way. Semisimple Lie algebras over an algebraically
closed field have been completely classified through their root systems. The classification of solvable Lie algebras is
a 'wild' problem, and cannot be accomplished in general.
Cartan's criterion gives conditions for a Lie algebra to be nilpotent, solvable, or semisimple. It is based on the notion
of the Killing form, a symmetric bilinear form on $\mathfrak{g}$ defined by the formula
$K(u, v) = \operatorname{tr}(\operatorname{ad}(u)\operatorname{ad}(v)),$
where tr denotes the trace of a linear operator. A Lie algebra $\mathfrak{g}$ is semisimple if and only if the Killing form is
nondegenerate. A Lie algebra $\mathfrak{g}$ is solvable if and only if $K(\mathfrak{g}, [\mathfrak{g}, \mathfrak{g}]) = 0$.
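Cartan's criterion can be tried on two small algebras. The sketch below assumes the standard basis (e, f, h) of the special linear Lie algebra of 2 × 2 matrices with [h, e] = 2e, [h, f] = -2f, [e, f] = h, and the Heisenberg algebra with [x, y] = z; the dictionary encoding of brackets and all function names are hypothetical.

```python
# Sketch: Killing form K(u, v) = tr(ad(u) ad(v)) computed on a basis.

def ad_basis(table, n):
    """List of matrices ad(e_i); column j of ad(e_i) is [e_i, e_j]."""
    mats = []
    for i in range(n):
        m = [[0] * n for _ in range(n)]
        for j in range(n):
            if (i, j) in table:
                coeffs = table[(i, j)]
            elif (j, i) in table:
                coeffs = [-c for c in table[(j, i)]]  # anticommutativity
            else:
                continue
            for k in range(n):
                m[k][j] = coeffs[k]
        mats.append(m)
    return mats

def killing_matrix(table, n):
    """Gram matrix of the Killing form; tr(AB) = sum over r, k of A[r][k] * B[k][r]."""
    ads = ad_basis(table, n)
    return [[sum(ads[i][r][k] * ads[j][k][r]
                 for r in range(n) for k in range(n))
             for j in range(n)] for i in range(n)]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

# sl_2 with basis order (e, f, h).
sl2 = {(2, 0): [2, 0, 0], (2, 1): [0, -2, 0], (0, 1): [0, 0, 1]}
# Heisenberg algebra with basis order (x, y, z).
heisenberg = {(0, 1): [0, 0, 1]}

k_sl2 = killing_matrix(sl2, 3)
k_heis = killing_matrix(heisenberg, 3)
print(k_sl2, det3(k_sl2))    # nonzero determinant: nondegenerate form
print(k_heis, det3(k_heis))  # identically zero form
```

The nonzero determinant for the first algebra is consistent with its being semisimple, while the vanishing form for the Heisenberg algebra is consistent with the solvability criterion.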
Relation to Lie groups
Although Lie algebras are often studied in their own right, historically they arose as a means to study Lie groups.
Given a Lie group, a Lie algebra can be associated to it either by endowing the tangent space to the identity with the
differential of the adjoint map, or by considering the left-invariant vector fields as mentioned in the examples. This
association is functorial, meaning that homomorphisms of Lie groups lift to homomorphisms of Lie algebras, and
various properties are satisfied by this lifting: it commutes with composition, it maps Lie subgroups, kernels,
quotients and cokernels of Lie groups to subalgebras, kernels, quotients and cokernels of Lie algebras, respectively.
The functor which takes each Lie group to its Lie algebra and each homomorphism to its differential is a faithful and
exact functor. This functor is not invertible; different Lie groups may have the same Lie algebra, for example SO(3)
and SU(2) have isomorphic Lie algebras. Even worse, some Lie algebras need not have any associated Lie group.
Nevertheless, when the Lie algebra is finite-dimensional, there is always at least one Lie group whose Lie algebra is
the one under discussion, and a preferred Lie group can be chosen. Any finite-dimensional connected Lie group has
a universal cover. This group can be constructed as the image of the Lie algebra under the exponential map. More
generally, we have that the Lie algebra is homeomorphic to a neighborhood of the identity. But globally, if the Lie
group is compact, the exponential will not be injective, and if the Lie group is not connected, simply connected or
compact, the exponential map need not be surjective.
If the Lie algebra is infinite-dimensional, the issue is more subtle. In many instances, the exponential map is not even
locally a homeomorphism (for example, in $\mathrm{Diff}(S^1)$, one may find diffeomorphisms arbitrarily close to the identity
which are not in the image of exp). Furthermore, some infinite-dimensional Lie algebras are not the Lie algebra of
any group.
The correspondence between Lie algebras and Lie groups is used in several ways, including in the classification of
Lie groups and the related matter of the representation theory of Lie groups. Every representation of a Lie algebra
lifts uniquely to a representation of the corresponding connected, simply connected Lie group, and conversely every
representation of any Lie group induces a representation of the group's Lie algebra; the representations are in one to
one correspondence. Therefore, knowing the representations of a Lie algebra settles the question of representations
of the group. As for classification, it can be shown that any connected Lie group with a given Lie algebra is
isomorphic to the universal cover mod a discrete central subgroup. So classifying Lie groups becomes simply a
matter of counting the discrete subgroups of the center, once the classification of Lie algebras is known (solved by
Cartan et al. in the semisimple case).
Category theoretic definition
Using the language of category theory, a Lie algebra can be defined as an object A in Vec, the category of vector
spaces, together with a morphism $[\cdot,\cdot] : A \otimes A \to A$, where $\otimes$ refers to the monoidal product of Vec, such that
• $[\cdot,\cdot] \circ (\mathrm{id} + \tau_{A,A}) = 0$
• $[\cdot,\cdot] \circ ([\cdot,\cdot] \otimes \mathrm{id}) \circ (\mathrm{id} + \sigma + \sigma^2) = 0$
where $\tau(a \otimes b) := b \otimes a$ and $\sigma$ is the cyclic permutation braiding $(\mathrm{id} \otimes \tau_{A,A}) \circ (\tau_{A,A} \otimes \mathrm{id})$.
[Figure: the two identities in diagrammatic form.]
Notes
[1] Due to the anticommutativity of the commutator, the notions of a left and right ideal in a Lie algebra coincide.
[2] Humphreys p. 2
References
• Hall, Brian C. Lie Groups, Lie Algebras, and Representations: An Elementary Introduction, Springer, 2003. ISBN
0-387-40122-9
• Erdmann, Karin & Wildon, Mark. Introduction to Lie Algebras, 1st edition, Springer, 2006. ISBN 1-84628-040-0
• Humphreys, James E. Introduction to Lie Algebras and Representation Theory, Second printing, revised.
Graduate Texts in Mathematics, 9. Springer-Verlag, New York, 1978. ISBN 0-387-90053-5
• Jacobson, Nathan, Lie algebras, Republication of the 1962 original. Dover Publications, Inc., New York, 1979.
ISBN 0-486-63832-4
• Kac, Victor G. et al. Course notes for MIT 18.745: Introduction to Lie Algebras,
http://www-math.mit.edu/~lesha/745lec/
• O'Connor, J. J. & Robertson, E.F. Biography of Sophus Lie, MacTutor History of Mathematics Archive,
http://www-history.mcs.st-and.ac.uk/Biographies/Lie.html
• O'Connor, J. J. & Robertson, E.F. Biography of Wilhelm Killing, MacTutor History of Mathematics Archive,
http://www-history.mcs.st-and.ac.uk/Biographies/Killing.html
• Steeb, W.-H. Continuous Symmetries, Lie Algebras, Differential Equations and Computer Algebra, second
edition, World Scientific, 2007, ISBN 978-981-270-809-0
• Varadarajan, V. S. Lie Groups, Lie Algebras, and Their Representations, 1st edition, Springer, 2004. ISBN
0-387-90969-9
Lie group
In mathematics, a Lie group (pronounced /ˈliː/, similar to "Lee") is a group which is also a differentiable manifold,
with the property that the group operations are compatible with the smooth structure. Lie groups are named after
Sophus Lie, who laid the foundations of the theory of continuous transformation groups.
Lie groups represent the best-developed theory of continuous symmetry of mathematical objects and structures,
which makes them indispensable tools for many parts of contemporary mathematics, as well as for modern
theoretical physics. They provide a natural framework for analysing the continuous symmetries of differential
equations (Differential Galois theory), in much the same way as permutation groups are used in Galois theory for
analysing the discrete symmetries of algebraic equations. An extension of Galois theory to the case of continuous
symmetry groups was one of Lie's principal motivations.
Overview
[Figure: The circle of center 0 and radius 1 in the complex plane is a Lie group with complex multiplication.]
Lie groups are smooth manifolds and, therefore, can be studied using
differential calculus, in contrast with the case of more general
topological groups. One of the key ideas in the theory of Lie groups,
from Sophus Lie, is to replace the global object, the group, with its
local or linearized version, which Lie himself called its "infinitesimal
group" and which has since become known as its Lie algebra.
Lie groups play an enormous role in modern geometry, on several
different levels. Felix Klein argued in his Erlangen program that one
can consider various "geometries" by specifying an appropriate
transformation group that leaves certain geometric properties invariant.
Thus Euclidean geometry corresponds to the choice of the group E(3)
of distance-preserving transformations of the Euclidean space $\mathbb{R}^3$,
conformal geometry corresponds to enlarging the group to the
conformal group, whereas in projective geometry one is interested in
the properties invariant under the projective group. This idea later led to the notion of a G-structure, where G is a Lie
group of "local" symmetries of a manifold. On a "global" level, whenever a Lie group acts on a geometric object,
such as a Riemannian or a symplectic manifold, this action provides a measure of rigidity and yields a rich algebraic
structure. The presence of continuous symmetries expressed via a Lie group action on a manifold places strong
constraints on its geometry and facilitates analysis on the manifold. Linear actions of Lie groups are especially
important, and are studied in representation theory.
In the 1940s-1950s, Ellis Kolchin, Armand Borel and Claude Chevalley realised that many foundational results
concerning Lie groups can be developed completely algebraically, giving rise to the theory of algebraic groups
defined over an arbitrary field. This insight opened new possibilities in pure algebra, by providing a uniform
construction for most finite simple groups, as well as in algebraic geometry. The theory of automorphic forms, an
important branch of modern number theory, deals extensively with analogues of Lie groups over adele rings; p-adic
Lie groups play an important role, via their connections with Galois representations in number theory.
Definitions and examples
A real Lie group is a group which is also a finite-dimensional real smooth manifold, and in which the group
operations of multiplication and inversion are smooth maps. Smoothness of the group multiplication
$\mu : G \times G \to G, \quad \mu(x, y) = xy$
means that μ is a smooth mapping of the product manifold G×G into G. These two requirements can be combined to
the single requirement that the mapping
$(x, y) \mapsto x^{-1} y$
be a smooth mapping of the product manifold into G.
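A short derivation sketches why one map can encode both requirements. It takes the combined map to be the standard choice sending (x, y) to the product of the inverse of x with y; the symbols ν for this map and e for the identity element are labels introduced here, not notation from the text.

```latex
\nu(x, y) = x^{-1} y
\quad\Longrightarrow\quad
x^{-1} = \nu(x, e),
\qquad
xy = \nu\bigl(\nu(x, e),\, y\bigr).
```

Inversion is thus ν composed with the smooth map sending x to (x, e), and multiplication is a further composition of smooth maps, so smoothness of ν gives both. Conversely, if multiplication and inversion are smooth, then ν is smooth as the composite of inversion in the first factor followed by multiplication.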
First examples
• The 2 × 2 real invertible matrices form a group under multiplication, denoted by $\mathrm{GL}_2(\mathbb{R})$:
$\mathrm{GL}_2(\mathbb{R}) = \left\{ A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} : \det A = ad - bc \neq 0 \right\}.$
This is a four-dimensional noncompact real Lie group. This group is disconnected; it has two connected
components corresponding to the positive and negative values of the determinant.
• The rotation matrices form a subgroup of $\mathrm{GL}_2(\mathbb{R})$, denoted by $\mathrm{SO}_2(\mathbb{R})$. It is a Lie group in its own right:
specifically, a one-dimensional compact connected Lie group which is diffeomorphic to the circle. Using the
rotation angle φ as a parameter, this group can be parametrized as follows:
$\mathrm{SO}_2(\mathbb{R}) = \left\{ \begin{pmatrix} \cos\varphi & -\sin\varphi \\ \sin\varphi & \cos\varphi \end{pmatrix} : \varphi \in \mathbb{R}/2\pi\mathbb{Z} \right\}.$
Addition of the angles corresponds to multiplication of the elements of $\mathrm{SO}_2(\mathbb{R})$, and taking the opposite angle
corresponds to inversion. Thus both multiplication and inversion are differentiable maps.
• The orthogonal group also forms an interesting example of a Lie group.
All of the previous examples of Lie groups fall within the class of classical groups.
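The correspondence between angles and rotation matrices can be checked numerically. This is a minimal sketch with illustrative names (rotation, close) and arbitrarily chosen angles; comparisons use a small tolerance because the entries are floating-point values.

```python
# Sketch: in the rotation group of the plane, multiplying matrices adds
# angles and negating the angle gives the inverse.
import math

def rotation(phi):
    """Rotation matrix for the angle phi (in radians)."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, -s], [s, c]]

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def close(a, b, tol=1e-12):
    """Entrywise comparison up to a floating-point tolerance."""
    return all(abs(x - y) <= tol
               for ra, rb in zip(a, b) for x, y in zip(ra, rb))

alpha, beta = 0.7, 2.1
identity = [[1.0, 0.0], [0.0, 1.0]]

print(close(mat_mul(rotation(alpha), rotation(beta)), rotation(alpha + beta)))
print(close(mat_mul(rotation(alpha), rotation(-alpha)), identity))
print(close(rotation(alpha + 2 * math.pi), rotation(alpha)))  # angles mod 2*pi
```

The last check reflects that the parameter only matters modulo a full turn, which is why the group is diffeomorphic to a circle.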
Related concepts
A complex Lie group is defined in the same way using complex manifolds rather than real ones (example: $\mathrm{SL}_2(\mathbb{C})$),
and similarly one can define a p-adic Lie group over the p-adic numbers. Hilbert's fifth problem asked whether
replacing differentiable manifolds with topological or analytic ones can yield new examples. The answer to this
question turned out to be negative: in 1952, Gleason, Montgomery and Zippin showed that if G is a topological
manifold with continuous group operations, then there exists exactly one analytic structure on G which turns it into a
Lie group (see also Hilbert-Smith conjecture). If the underlying manifold is allowed to be infinite dimensional (for
example, a Hilbert manifold) then one arrives at the notion of an infinite-dimensional Lie group. It is possible to
define analogues of many Lie groups over finite fields, and these give most of the examples of finite simple groups.
The language of category theory provides a concise definition for Lie groups: a Lie group is a group object in the
category of smooth manifolds. This is important, because it allows generalization of the notion of a Lie group to Lie
supergroups.
More examples of Lie groups
Lie groups occur in abundance throughout mathematics and physics. Matrix groups or algebraic groups are (roughly)
groups of matrices (for example, orthogonal and symplectic groups), and these give most of the more common
examples of Lie groups.
Examples
• Euclidean space $\mathbb{R}^n$ with ordinary vector addition as the group operation becomes an n-dimensional noncompact
abelian Lie group.
• The circle group $S^1$ consisting of angles mod 2π under addition or, alternately, the complex numbers with
absolute value 1 under multiplication. This is a one-dimensional compact connected abelian Lie group.
• The group $\mathrm{GL}_n(\mathbb{R})$ of invertible matrices (under matrix multiplication) is a Lie group of dimension $n^2$, called the
general linear group. It has a closed connected subgroup $\mathrm{SL}_n(\mathbb{R})$, the special linear group, consisting of matrices
of determinant 1 which is also a Lie group.
• The orthogonal group $\mathrm{O}_n(\mathbb{R})$, consisting of all n × n orthogonal matrices with real entries, is an
n(n - 1)/2-dimensional Lie group. This group is disconnected, but it has a connected subgroup $\mathrm{SO}_n(\mathbb{R})$ of the same
dimension consisting of orthogonal matrices of determinant 1, called the special orthogonal group (for n = 3, the
rotation group).
• The Euclidean group E^(R) is the Lie group of all Euclidean motions, i.e., isometric affine maps, of
^-dimensional Euclidean space R".
• The unitary group U(n) consisting of n x n unitary matrices (with complex entries) is a compact connected Lie
2 2
group of dimension n . Unitary matrices of determinant 1 form a closed connected subgroup of dimension n - 1
denoted S\J(n), the special unitary group.
• Spin groups are double covers of the special orthogonal groups, used for studying fermions in quantum field
theory (among other things).
• The symplectic group Sp (R) consists of all 2n x 2n matrices preserving a nondegenerate skew- symmetric
In 2
bilinear form on R (the symplectic form). It is a connected Lie group of dimension 2n + n. The fundamental
group of the symplectic group is Z and this fact is related to the theory of Maslov index.
• The 3 -sphere S forms a Lie group by identification with the set of quaternions of unit norm, called versors. The
only other spheres that admit the structure of a Lie group are the 0-sphere S° (real numbers with absolute value 1)
and the circle S 1 (complex numbers with absolute value 1). For example, for even n > 1, S n is not a Lie group
because it does not admit a nonvanishing vector field and so a fortiori cannot be parallelizable as a differentiable
0 13 7
manifold. Of the spheres only S , S , S , and S are parallelizable. The latter carries the structure of a Lie
quasigroup (a nonassociative group), which can be identified with the set of unit octonions.
• The group of upper triangular n by n matrices is a solvable Lie group of dimension nin + l)/2.
• The Lorentz group and the Poincare group are the groups of linear and affine isometries of the Minkowski space
(interpreted as the spacetime of the special relativity). They are Lie groups of dimensions 6 and 10.
• The Heisenberg group is a connected nilpotent Lie group of dimension 3, playing a key role in quantum
mechanics.
• The group U(1)×SU(2)×SU(3) is a Lie group of dimension 1+3+8=12 that is the gauge group of the Standard
Model in particle physics. The dimensions of the factors correspond to the 1 photon + 3 vector bosons + 8 gluons
of the Standard Model.
• The (3-dimensional) metaplectic group is a double cover of $SL_2(\mathbf{R})$ playing an important role in the theory of
modular forms. It is a connected Lie group that cannot be faithfully represented by matrices of finite size, i.e., a
nonlinear group.
• The exceptional Lie groups of types $G_2$, $F_4$, $E_6$, $E_7$, $E_8$ have dimensions 14, 52, 78, 133, and 248. There is also a
group $E_{7\frac{1}{2}}$ of dimension 190.
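The dimension formulas quoted in the list above can be tabulated in a few lines. This is only an illustrative sketch: the function names are hypothetical and simply encode the formulas stated in the text.

```python
# Dimension formulas quoted in the examples list, as functions of n.
# The function names are illustrative, not taken from any library.

def dim_unitary(n):
    return n * n                 # U(n)

def dim_special_unitary(n):
    return n * n - 1             # SU(n)

def dim_symplectic(n):
    return 2 * n * n + n         # Sp(2n, R)

def dim_upper_triangular(n):
    return n * (n + 1) // 2      # invertible upper triangular n x n matrices

# Dimensions add under direct products, so for U(1) x SU(2) x SU(3):
standard_model_dim = (
    dim_unitary(1) + dim_special_unitary(2) + dim_special_unitary(3)
)
print(standard_model_dim)
```

For n = 1 the symplectic formula gives 3, which agrees with the dimension of $SL_2(\mathbf{R})$.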
Constructions
There are several standard ways to form new Lie groups from old ones:
• The product of two Lie groups is a Lie group.
• Any topologically closed subgroup of a Lie group is a Lie group. This is known as Cartan's theorem.
• The quotient of a Lie group by a closed normal subgroup is a Lie group.
• The universal cover of a connected Lie group is a Lie group. For example, the group R is the universal cover of
the circle group $S^1$. In fact any covering of a differentiable manifold is also a differentiable manifold, but by
specifying universal cover, one guarantees a group structure (compatible with its other structures).
Related notions
Some examples of groups that are not Lie groups (except in the trivial sense that any group can be viewed as a
0-dimensional Lie group, with the discrete topology), are:
• Infinite dimensional groups, such as the additive group of an infinite dimensional real vector space. These are not
Lie groups as they are not finite dimensional manifolds.
• Some totally disconnected groups, such as the Galois group of an infinite extension of fields, or the additive group
of the p-adic numbers. These are not Lie groups because their underlying spaces are not real manifolds. (Some of
these groups are "p-adic Lie groups".) In general, only topological groups having similar local properties to $\mathbf{R}^n$ for
some positive integer n can be Lie groups (of course they must also have a differentiable structure).
Early history
According to the most authoritative source on the early history of Lie groups (Hawkins, p. 1), Sophus Lie himself
considered the winter of 1873-1874 as the birth date of his theory of continuous groups. Hawkins, however,
suggests that it was "Lie's prodigious research activity during the four-year period from the fall of 1869 to the fall of
1873" that led to the theory's creation (ibid). Some of Lie's early ideas were developed in close collaboration with
Felix Klein. Lie met with Klein every day from October 1869 through 1872: in Berlin from the end of October 1869
to the end of February 1870, and in Paris, Göttingen and Erlangen in the subsequent two years (ibid, p. 2). Lie stated
that all of the principal results were obtained by 1884. But during the 1870s all his papers (except the very first note)
were published in Norwegian journals, which impeded recognition of the work throughout the rest of Europe (ibid,
p. 76). In 1884 a young German mathematician, Friedrich Engel, came to work with Lie on a systematic treatise
setting out his theory of continuous groups. From this effort resulted the three-volume Theorie der
Transformationsgruppen, published in 1888, 1890, and 1893.
Lie's ideas did not stand in isolation from the rest of mathematics. In fact, his interest in the geometry of differential
equations was first motivated by the work of Carl Gustav Jacobi, on the theory of partial differential equations of
first order and on the equations of classical mechanics. Much of Jacobi's work was published posthumously in the
1860s, generating enormous interest in France and Germany (Hawkins, p. 43). Lie's idée fixe was to develop a theory
of symmetries of differential equations that would accomplish for them what Évariste Galois had done for algebraic
equations: namely, to classify them in terms of group theory. Lie and other mathematicians showed that the most
important equations for special functions and orthogonal polynomials tend to arise from group theoretical
symmetries. Additional impetus to consider continuous groups came from ideas of Bernhard Riemann, on the
foundations of geometry, and their further development in the hands of Klein. Thus three major themes in 19th
century mathematics were combined by Lie in creating his new theory: the idea of symmetry, as exemplified by
Galois through the algebraic notion of a group; geometric theory and the explicit solutions of differential equations
of mechanics, worked out by Poisson and Jacobi; and the new understanding of geometry that emerged in the works
of Plücker, Möbius, Grassmann and others, and culminated in Riemann's revolutionary vision of the subject.
Although today Sophus Lie is rightfully recognized as the creator of the theory of continuous groups, a major stride
in the development of their structure theory, which was to have a profound influence on subsequent development of
mathematics, was made by Wilhelm Killing, who in 1888 published the first paper in a series entitled Die
Zusammensetzung der stetigen endlichen Transformationsgruppen (The composition of continuous finite
transformation groups) (Hawkins, p. 100). The work of Killing, later refined and generalized by Élie Cartan, led to
classification of semisimple Lie algebras, Cartan's theory of symmetric spaces, and Hermann Weyl's description of
representations of compact and semisimple Lie groups using highest weights.
Weyl brought the early period of the development of the theory of Lie groups to fruition, for not only did he classify
irreducible representations of semisimple Lie groups and connect the theory of groups with quantum mechanics, but
he also put Lie's theory itself on firmer footing by clearly enunciating the distinction between Lie's infinitesimal
groups (i.e., Lie algebras) and the Lie groups proper, and began investigations of topology of Lie groups (Borel
2001). The theory of Lie groups was systematically reworked in modern mathematical language in a monograph
by Claude Chevalley.
The concept of a Lie group, and possibilities of classification
Lie groups may be thought of as smoothly varying families of symmetries. Examples of symmetries include rotation
about an axis. What must be understood is the nature of 'small' transformations, e.g., rotations through tiny angles,
that link nearby transformations. The mathematical object capturing this structure is called a Lie algebra (Lie himself
called them "infinitesimal groups"). It can be defined because Lie groups are manifolds, so have tangent spaces at
each point.
The Lie algebra of any compact Lie group (very roughly: one for which the symmetries form a bounded set) can be
decomposed as a direct sum of an abelian Lie algebra and some number of simple ones. The structure of an abelian
Lie algebra is mathematically uninteresting (since the Lie bracket is identically zero); the interest is in the simple
summands. Hence the question arises: what are the simple Lie algebras of compact groups? It turns out that they
mostly fall into four infinite families, the "classical Lie algebras" $A_n$, $B_n$, $C_n$ and $D_n$, which have simple descriptions
in terms of symmetries of Euclidean space. But there are also just five "exceptional Lie algebras" that do not fall into
any of these families. $E_8$ is the largest of these.
Properties
• The diffeomorphism group of a Lie group acts transitively on the Lie group
• Every Lie group is parallelizable, and hence an orientable manifold (there is a bundle isomorphism between its
tangent bundle and the product of itself with the tangent space at the identity)
Types of Lie groups and structure theory
Lie groups are classified according to their algebraic properties (simple, semisimple, solvable, nilpotent, abelian),
their connectedness (connected or simply connected) and their compactness.
• Compact Lie groups are all known: they are finite central quotients of a product of copies of the circle group $S^1$
and simple compact Lie groups (which correspond to connected Dynkin diagrams).
• Any simply connected solvable Lie group is isomorphic to a closed subgroup of the group of invertible upper
triangular matrices of some rank, and any finite dimensional irreducible representation of such a group is 1
dimensional. Solvable groups are too messy to classify except in a few small dimensions.
• Any simply connected nilpotent Lie group is isomorphic to a closed subgroup of the group of invertible upper
triangular matrices with 1's on the diagonal of some rank, and any finite dimensional irreducible representation of
such a group is 1 dimensional. Like solvable groups, nilpotent groups are too messy to classify except in a few
small dimensions.
• Simple Lie groups are sometimes defined to be those that are simple as abstract groups, and sometimes defined to
be connected Lie groups with a simple Lie algebra. For example, $SL_2(\mathbf{R})$ is simple according to the second
definition but not according to the first. They have all been classified (for either definition).
• Semisimple Lie groups are Lie groups whose Lie algebra is a product of simple Lie algebras. They are central
extensions of products of simple Lie groups.
The identity component of any Lie group is an open normal subgroup, and the quotient group is a discrete group.
The universal cover of any connected Lie group is a simply connected Lie group, and conversely any connected Lie
group is a quotient of a simply connected Lie group by a discrete normal subgroup of the center. Any Lie group G
can be decomposed into discrete, simple, and abelian groups in a canonical way as follows. Write
$G_{con}$ for the connected component of the identity
$G_{sol}$ for the largest connected normal solvable subgroup
$G_{nil}$ for the largest connected normal nilpotent subgroup
so that we have a sequence of normal subgroups
$1 \subseteq G_{nil} \subseteq G_{sol} \subseteq G_{con} \subseteq G$.
Then
$G/G_{con}$ is discrete.
$G_{con}/G_{sol}$ is a central extension of a product of simple connected Lie groups.
$G_{sol}/G_{nil}$ is abelian. A connected abelian Lie group is isomorphic to a product of copies of R and the circle
group $S^1$.
$G_{nil}/1$ is nilpotent, and therefore its ascending central series has all quotients abelian.
This can be used to reduce some problems about Lie groups (such as finding their unitary representations) to the
same problems for connected simple groups and nilpotent and solvable subgroups of smaller dimension.
The Lie algebra associated with a Lie group
To every Lie group, we can associate a Lie algebra, whose underlying vector space is the tangent space of G at the
identity element, which completely captures the local structure of the group. Informally we can think of elements of
the Lie algebra as elements of the group that are "infinitesimally close" to the identity, and the Lie bracket is
something to do with the commutator of two such infinitesimal elements. Before giving the abstract definition we
give a few examples:
• The Lie algebra of the vector space $\mathbf{R}^n$ is just $\mathbf{R}^n$ with the Lie bracket given by
[A, B] = 0.
(In general the Lie bracket of a connected Lie group is always 0 if and only if the Lie group is abelian.)
• The Lie algebra of the general linear group $GL_n(\mathbf{R})$ of invertible matrices is the vector space $M_n(\mathbf{R})$ of square
matrices with the Lie bracket given by
[A, B] = AB − BA.
If G is a closed subgroup of $GL_n(\mathbf{R})$ then the Lie algebra of G can be thought of informally as the matrices m of
$M_n(\mathbf{R})$ such that 1 + εm is in G, where ε is an infinitesimal positive number with $\varepsilon^2 = 0$ (of course, no such real
number ε exists). For example, the orthogonal group $O_n(\mathbf{R})$ consists of matrices A with $AA^T = 1$, so the Lie algebra
consists of the matrices m with $(1 + \varepsilon m)(1 + \varepsilon m)^T = 1$, which is equivalent to $m + m^T = 0$ because $\varepsilon^2 = 0$.
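The orthogonal-group computation can be checked numerically by replacing the formal infinitesimal with a small real number. The sketch below makes that substitution (an assumption, since no real ε has square zero) and uses hypothetical helper names: for a skew-symmetric m the defect of $(1 + \varepsilon m)(1 + \varepsilon m)^T$ from the identity is of order $\varepsilon^2$, while for a generic m it is of order ε.

```python
def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def orthogonality_defect(m, eps):
    """Largest entry of (1 + eps*m)(1 + eps*m)^T - 1 in absolute value."""
    n = len(m)
    one = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    g = [[one[i][j] + eps * m[i][j] for j in range(n)] for i in range(n)]
    p = mat_mul(g, transpose(g))
    return max(abs(p[i][j] - one[i][j]) for i in range(n) for j in range(n))

skew = [[0.0, 1.0, -2.0], [-1.0, 0.0, 3.0], [2.0, -3.0, 0.0]]   # m + m^T = 0
generic = [[1.0, 1.0, 0.0], [0.0, 0.0, 3.0], [2.0, 0.0, 0.0]]   # m + m^T != 0

eps = 1e-4
skew_defect = orthogonality_defect(skew, eps)        # second order in eps
generic_defect = orthogonality_defect(generic, eps)  # first order in eps
print(skew_defect, generic_defect)
```

The first-order term of the product is $\varepsilon(m + m^T)$, which is why only skew-symmetric matrices survive the test.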
• Formally, when working over the reals, as here, this is accomplished by considering the limit as ε → 0; but the
"infinitesimal" language generalizes directly to Lie groups over general rings.
The concrete definition given above is easy to work with, but has some minor problems: to use it we first need to
represent a Lie group as a group of matrices, but not all Lie groups can be represented in this way, and it is not
obvious that the Lie algebra is independent of the representation we use. To get round these problems we give the
general definition of the Lie algebra of any Lie group (in 4 steps):
1. Vector fields on any smooth manifold M can be thought of as derivations X of the ring of smooth functions on the
manifold, and therefore form a Lie algebra under the Lie bracket [X, Y] = XY - YX, because the Lie bracket of any
two derivations is a derivation.
2. If G is any group acting smoothly on the manifold M, then it acts on the vector fields, and the vector space of
vector fields fixed by the group is closed under the Lie bracket and therefore also forms a Lie algebra.
3. We apply this construction to the case when the manifold M is the underlying space of a Lie group G, with G
acting on G = M by left translations $L_g(h) = gh$. This shows that the space of left invariant vector fields (vector
fields satisfying $L_{g*} X_h = X_{gh}$ for every h in G, where $L_{g*}$ denotes the differential of $L_g$) on a Lie group is a Lie
algebra under the Lie bracket of vector fields.
4. Any tangent vector at the identity of a Lie group can be extended to a left invariant vector field by left translating
the tangent vector to other points of the manifold. Specifically, the left invariant extension of an element v of the
tangent space at the identity is the vector field defined by $\hat{v}_g = L_{g*} v$. This identifies the tangent space $T_e$ at the
identity with the space of left invariant vector fields, and therefore makes the tangent space at the identity into a
Lie algebra, called the Lie algebra of G, usually denoted by a Fraktur $\mathfrak{g}$. Thus the Lie bracket on $\mathfrak{g}$ is given
explicitly by $[v, w] = [\hat{v}, \hat{w}]_e$.
This Lie algebra $\mathfrak{g}$ is finite-dimensional and it has the same dimension as the manifold G. The Lie algebra of G
determines G up to "local isomorphism", where two Lie groups are called locally isomorphic if they look the same
near the identity element. Problems about Lie groups are often solved by first solving the corresponding problem for
the Lie algebras, and the result for groups then usually follows easily. For example, simple Lie groups are usually
classified by first classifying the corresponding Lie algebras.
We could also define a Lie algebra structure on $T_e$ using right invariant vector fields instead of left invariant vector
fields. This leads to the same Lie algebra, because the inverse map on G can be used to identify left invariant vector
fields with right invariant vector fields, and acts as −1 on the tangent space $T_e$.
The Lie algebra structure on $T_e$ can also be described as follows: the commutator operation
$(x, y) \mapsto x y x^{-1} y^{-1}$
on G × G sends (e, e) to e, so its derivative yields a bilinear operation on $T_e G$. This bilinear operation is actually the
zero map, but the second derivative, under the proper identification of tangent spaces, yields an operation that
satisfies the axioms of a Lie bracket, and it is equal to twice the one defined through left-invariant vector fields.
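For matrix groups the vanishing first-order term and the bracket at second order can be observed directly. The following sketch assumes the matrix setting and uses hypothetical helper names; it compares the group commutator exp(tA) exp(tB) exp(−tA) exp(−tB) with $1 + t^2 [A, B]$ for a small t.

```python
def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def mat_exp(a, terms=25):
    """Truncated power series 1 + A + A^2/2! + ... (adequate for small matrices)."""
    n = len(a)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def group_commutator(a, b, t):
    """exp(tA) exp(tB) exp(-tA) exp(-tB)."""
    x, y = mat_exp(scale(t, a)), mat_exp(scale(t, b))
    x_inv, y_inv = mat_exp(scale(-t, a)), mat_exp(scale(-t, b))
    return mat_mul(mat_mul(x, y), mat_mul(x_inv, y_inv))

A = [[0.0, 1.0], [0.0, 0.0]]
B = [[0.0, 0.0], [1.0, 0.0]]
AB, BA = mat_mul(A, B), mat_mul(B, A)
bracket = [[AB[i][j] - BA[i][j] for j in range(2)] for i in range(2)]  # [A, B]

t = 1e-2
c = group_commutator(A, B, t)
# Deviation from the second-order prediction 1 + t^2 [A, B]; it is of order t^3.
error = max(abs(c[i][j] - ((1.0 if i == j else 0.0) + t * t * bracket[i][j]))
            for i in range(2) for j in range(2))
print(error)
```

There is no term linear in t, matching the statement that the first derivative of the commutator map is zero.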
Homomorphisms and isomorphisms
If G and H are Lie groups, then a Lie-group homomorphism f : G → H is a smooth group homomorphism. (It is
equivalent to require only that f be continuous rather than smooth.) The composition of two such homomorphisms is
again a homomorphism, and the class of all Lie groups, together with these morphisms, forms a category. Two Lie
groups are called isomorphic if there exists a bijective homomorphism between them whose inverse is also a
homomorphism. Isomorphic Lie groups are essentially the same; they only differ in the notation for their elements.
Every homomorphism f : G → H of Lie groups induces a homomorphism between the corresponding Lie algebras $\mathfrak{g}$
and $\mathfrak{h}$. The association $G \mapsto \mathfrak{g}$ is a functor (mapping between categories satisfying certain axioms).
One version of Ado's theorem is that every finite dimensional Lie algebra is isomorphic to a matrix Lie algebra. For
every finite dimensional matrix Lie algebra, there is a linear group (matrix Lie group) with this algebra as its Lie
algebra. So every abstract Lie algebra is the Lie algebra of some (linear) Lie group.
The global structure of a Lie group is not determined by its Lie algebra; for example, if Z is any discrete subgroup of
the center of G then G and G/Z have the same Lie algebra (see the table of Lie groups for examples). A connected
Lie group is simple, semisimple, solvable, nilpotent, or abelian if and only if its Lie algebra has the corresponding
property.
If we require that the Lie group be simply connected, then the global structure is determined by its Lie algebra: for
every finite dimensional Lie algebra $\mathfrak{g}$ over F there is a simply connected Lie group G with $\mathfrak{g}$ as Lie algebra, unique
up to isomorphism. Moreover every homomorphism between Lie algebras lifts to a unique homomorphism between
the corresponding simply connected Lie groups.
The exponential map
The exponential map from the Lie algebra $M_n(\mathbf{R})$ of the general linear group $GL_n(\mathbf{R})$ to $GL_n(\mathbf{R})$ is defined by the
usual power series:
$\exp(A) = 1 + A + \frac{A^2}{2!} + \frac{A^3}{3!} + \cdots$
for matrices A. If G is any closed subgroup of $GL_n(\mathbf{R})$, then the exponential map takes the Lie algebra of G into G, so we
have an exponential map for all matrix groups.
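As an illustration of the power series, the sketch below sums a truncated series for a 2 × 2 skew-symmetric matrix and compares the result with the rotation matrix it should produce. The helper names and the truncation at 30 terms are assumptions made for this example.

```python
import math

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(a, terms=30):
    """Truncated power series 1 + A + A^2/2! + A^3/3! + ..."""
    n = len(a)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term now holds A^k / k!
        term = [[x / k for x in row] for row in mat_mul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

theta = 0.7
generator = [[0.0, -theta], [theta, 0.0]]   # skew-symmetric 2 x 2 matrix
rotation = mat_exp(generator)
expected = [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]
max_error = max(abs(rotation[i][j] - expected[i][j])
                for i in range(2) for j in range(2))
print(max_error)
```

The generator is skew-symmetric, so it lies in the Lie algebra of the orthogonal group, and its exponential is a rotation, which lies in the group.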
The definition above is easy to use, but it is not defined for Lie groups that are not matrix groups, and it is not clear
that the exponential map of a Lie group does not depend on its representation as a matrix group. We can solve both
problems using a more abstract definition of the exponential map that works for all Lie groups, as follows.
Every vector v in $\mathfrak{g}$ determines a linear map from R to $\mathfrak{g}$ taking 1 to v, which can be thought of as a Lie algebra
homomorphism. Because R is the Lie algebra of the simply connected Lie group R, this induces a Lie group
homomorphism c : R → G so that
c(s + t) = c(s)c(t)
for all s and t. The operation on the right hand side is the group multiplication in G. The formal similarity of this
formula with the one valid for the exponential function justifies the definition
exp(v) = c(1).
This is called the exponential map, and it maps the Lie algebra $\mathfrak{g}$ into the Lie group G. It provides a diffeomorphism
between a neighborhood of 0 in $\mathfrak{g}$ and a neighborhood of e in G. This exponential map is a generalization of the
exponential function for real numbers (because R is the Lie algebra of the Lie group of positive real numbers with
multiplication), for complex numbers (because C is the Lie algebra of the Lie group of non-zero complex numbers
with multiplication) and for matrices (because $M_n(\mathbf{R})$ with the regular commutator is the Lie algebra of the Lie group
$GL_n(\mathbf{R})$ of all invertible matrices).
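In the matrix case the one-parameter subgroup attached to v is c(s) = exp(sv), and the homomorphism property c(s + t) = c(s)c(t) can be checked numerically. The sketch below assumes a matrix group and picks an arbitrary traceless v (an element of the Lie algebra of $SL_2(\mathbf{R})$); all helper names are hypothetical.

```python
def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(a, terms=40):
    """Truncated power series for the matrix exponential."""
    n = len(a)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

v = [[1.0, 2.0], [3.0, -1.0]]   # traceless, chosen arbitrarily for illustration

def c(s):
    """One-parameter subgroup s -> exp(s v)."""
    return mat_exp([[s * x for x in row] for row in v])

s, t = 0.3, 0.5
lhs = c(s + t)
rhs = mat_mul(c(s), c(t))
gap = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
det = lhs[0][0] * lhs[1][1] - lhs[0][1] * lhs[1][0]
print(gap, det)
```

The determinant stays equal to 1 up to rounding, so the curve remains inside $SL_2(\mathbf{R})$, as expected for a traceless generator.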
Because the exponential map is surjective on some neighbourhood N of e, it is common to call elements of the Lie
algebra infinitesimal generators of the group G. The subgroup of G generated by N is the identity component of G.
The exponential map and the Lie algebra determine the local group structure of every connected Lie group, because
of the Baker-Campbell-Hausdorff formula: there exists a neighborhood U of the zero element of $\mathfrak{g}$, such that for u,
v in U we have
exp(u) exp(v) = exp(u + v + 1/2 [u, v] + 1/12 [[u, v], v] − 1/12 [[u, v], u] − ...)
where the omitted terms are known and involve Lie brackets of four or more elements. In case u and v commute, this
formula reduces to the familiar exponential law exp(u) exp(v) = exp(u + v).
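The truncated formula can be tested on small matrices: each additional order of brackets should bring exp of the truncated exponent closer to exp(u) exp(v). The sketch below is a numerical check under that assumption, with hypothetical helper names and an arbitrary choice of u and v.

```python
def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def bracket(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    return add(mat_mul(a, b), scale(-1.0, mat_mul(b, a)))

def mat_exp(a, terms=30):
    n = len(a)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, a)]
        result = add(result, term)
    return result

u = scale(0.1, [[0.0, 1.0], [0.0, 0.0]])
v = scale(0.1, [[0.0, 0.0], [1.0, 0.0]])
uv = bracket(u, v)

first = add(u, v)                                   # u + v
second = add(first, scale(0.5, uv))                 # ... + 1/2 [u, v]
third = add(second, add(scale(1.0 / 12.0, bracket(uv, v)),
                        scale(-1.0 / 12.0, bracket(uv, u))))

lhs = mat_mul(mat_exp(u), mat_exp(v))

def err(z):
    e = mat_exp(z)
    return max(abs(lhs[i][j] - e[i][j]) for i in range(2) for j in range(2))

errors = [err(first), err(second), err(third)]
print(errors)
```

With entries of size 0.1 the errors shrink by roughly one power of 0.1 per order, consistent with the omitted terms involving four or more elements.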
The exponential map from the Lie algebra to the Lie group is not always onto, even if the group is connected (though
it does map onto the Lie group for connected groups that are either compact or nilpotent). For example, the
exponential map of $SL_2(\mathbf{R})$ is not surjective.
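One standard way to see this, sketched here as a worked computation that is not part of the original text, is to bound the trace of exp(X) for a traceless real 2 × 2 matrix X. The fragment assumes the amsmath package.

```latex
% X is a real 2x2 matrix with tr X = 0; write d = det X.
% Cayley-Hamilton gives X^2 = -d \cdot 1, so the exponential series collapses:
\[
\exp(X) =
\begin{cases}
\cos(\sqrt{d})\,1 + \dfrac{\sin\sqrt{d}}{\sqrt{d}}\,X, & d > 0,\\[4pt]
1 + X, & d = 0,\\[4pt]
\cosh(\sqrt{-d})\,1 + \dfrac{\sinh\sqrt{-d}}{\sqrt{-d}}\,X, & d < 0,
\end{cases}
\qquad\text{hence}\qquad
\operatorname{tr}\exp(X) \ge -2 .
\]
% The matrix diag(-2, -1/2) has determinant 1 but trace -5/2 < -2,
% so it lies in SL_2(R) and is not of the form exp(X).
```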
Infinite dimensional Lie groups
Lie groups are finite dimensional by definition, but there are many groups that resemble Lie groups, except for being
infinite dimensional. There is very little "general theory" of such groups, but some of the examples that have been
studied include:
• The group of diffeomorphisms of a manifold. Quite a lot is known about the group of diffeomorphisms of the
circle. Its Lie algebra is (more or less) the Witt algebra, which has a central extension called the Virasoro algebra,
used in string theory and conformal field theory. Very little is known about the diffeomorphism groups of
manifolds of larger dimension. The diffeomorphism group of spacetime sometimes appears in attempts to
quantize gravity.
• The group of smooth maps from a manifold to a finite dimensional Lie group is called a gauge group (with
operation of pointwise multiplication), and is used in quantum field theory and Donaldson theory. If the manifold
is a circle these are called loop groups, and have central extensions whose Lie algebras are (more or less)
Kac-Moody algebras.
• There are infinite dimensional analogues of general linear groups, orthogonal groups, and so on. One important
aspect is that these may have simpler topological properties: see for example Kuiper's theorem.
• Just as calculus in finite-dimensional real vector spaces can be extended to calculus in Banach spaces, the
definition of finite-dimensional smooth manifolds can be extended to give a definition of Banach analytic
manifolds. Similarly, the standard finite-dimensional definition of Lie groups can be extended to give a definition
of Banach analytic Lie groups. In this case, we have a Banach analytic manifold which simultaneously has a
group structure such that multiplication and inversion are analytic maps. Some of the theorems of
finite-dimensional Lie groups do not carry over to the Banach analytic case, and in particular the relation between
Lie groups and Lie algebras is much more subtle in the infinite dimensional case. However, it is true that "for
infinite dimensional Lie groups modeled on Banach spaces there is a well-developed theory ... which is closely
parallel to the theory of finite dimensional Lie groups."
Notes
[1] Sigurdur Helgason, "Differential Geometry, Lie Groups, and Symmetric Spaces", Academic Press, 1978, page 131.
[2] Andrew Pressley and Graeme Segal, Loop Groups, Oxford Science Publications, 1986, page 26.
References
• Adams, John Frank (1969), Lectures on Lie Groups, Chicago Lectures in Mathematics, Chicago: Univ. of
Chicago Press, ISBN 0-226-00527-5.
• Borel, Armand (2001), "Essays in the history of Lie groups and algebraic groups", History of Mathematics
(American Mathematical Society) 21, ISBN 0-8218-0288-7.
• Bourbaki, Nicolas, Elements of mathematics: Lie groups and Lie algebras. Chapters 1-3 ISBN 3-540-64242-0,
Chapters 4-6 ISBN 3-540-42650-7, Chapters 7-9 ISBN 3-540-43405-4
• Chevalley, Claude (1946), Theory of Lie groups, Princeton: Princeton University Press, ISBN 0-691-04990-4.
• Fulton, William; Harris, Joe (1991), Representation theory. A first course, Graduate Texts in Mathematics,
Readings in Mathematics, 129, New York: Springer-Verlag, MR1153249, ISBN 978-0-387-97527-6,
ISBN 978-0-387-97495-8
• Hall, Brian C. (2003), Lie Groups, Lie Algebras, and Representations: An Elementary Introduction, Springer,
ISBN 0-387-40122-9.
• Hawkins, Thomas (2000), Emergence of the theory of Lie groups, Springer, ISBN 0-387-98963-3
• Knapp, Anthony W. (2002), Lie Groups Beyond an Introduction, Progress in Mathematics, 140 (2nd ed.), Boston:
Birkhäuser, ISBN 0-8176-4259-5.
• Rossmann, Wulf (2001), Lie Groups: An Introduction Through Linear Groups, Oxford Graduate Texts in
Mathematics, Oxford University Press, ISBN 978-0198596837. The 2003 reprint corrects several typographical
mistakes.
• Serre, Jean-Pierre (1965), Lie Algebras and Lie Groups: 1964 Lectures given at Harvard University, Lecture
notes in mathematics, 1500, Springer, ISBN 3-540-55008-9.
• Steeb, Willi-Hans (2007), Continuous Symmetries, Lie algebras, Differential Equations and Computer Algebra:
second edition, World Scientific Publishing, ISBN 981-270-809-X.
Hopf algebra
In mathematics, a Hopf algebra, named after Heinz Hopf, is a structure that is simultaneously a (unital associative)
algebra, a (counital coassociative) coalgebra, with these structures compatible making it a bialgebra, and moreover is
equipped with an antiautomorphism satisfying a certain property.
Hopf algebras occur naturally in algebraic topology, where they originated and are related to the H-space concept, in
group scheme theory, in group theory (via the concept of a group ring), and in numerous other places, making them
probably the most familiar type of bialgebra. Hopf algebras are also studied in their own right, with much work on
specific classes of examples on the one hand and classification problems on the other.
Formal definition
Formally, a Hopf algebra is a bialgebra H over a field K together with a K-linear map S : H → H (called the
antipode) such that the following diagram commutes:
[Commutative diagram not reproduced; its arrows are labeled S ⊗ id and id ⊗ S.]
Here Δ is the comultiplication of the bialgebra, ∇ its multiplication, η its unit and ε its counit. In the sumless
Sweedler notation, this property can also be expressed as
$S(c_{(1)}) c_{(2)} = c_{(1)} S(c_{(2)}) = \varepsilon(c) 1$ for all c ∈ H.
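Written as an equation between maps H → H, the antipode condition reads as follows. This is the usual form of the axiom, stated with the symbols Δ, ∇, η and ε introduced above.

```latex
\[
\nabla \circ (S \otimes \mathrm{id}) \circ \Delta
  \;=\; \eta \circ \varepsilon
  \;=\; \nabla \circ (\mathrm{id} \otimes S) \circ \Delta .
\]
```

Applying both sides to an element c and writing $\Delta(c) = c_{(1)} \otimes c_{(2)}$ recovers the Sweedler form of the condition.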
As for algebras, one can replace the underlying field K with a commutative ring R in the above definition.
The definition of Hopf algebra is self-dual (as reflected in the symmetry of the above diagram), so if one can define a
dual of H (which is always possible if H is finite-dimensional), then it is automatically a Hopf algebra.
Properties of the antipode
The antipode S is sometimes required to have a K-linear inverse, which is automatic in the finite-dimensional case, or
if H is commutative or cocommutative (or more generally quasitriangular).
In general, S is an antihomomorphism, so $S^2$ is a homomorphism, which is therefore an automorphism if S was
invertible (as may be required).
If $S^2 = \mathrm{id}$, then the Hopf algebra is said to be involutive (and the underlying algebra with involution is a
*-algebra). If H is finite-dimensional semisimple over a field of characteristic zero, commutative, or cocommutative,
then it is involutive.
If a bialgebra B admits an antipode S, then S is unique ("a bialgebra admits at most 1 Hopf algebra structure").
The antipode is an analog to the inversion map on a group that sends g to $g^{-1}$.
Hopf subalgebras
A subalgebra K (not to be confused with the field K in the notation above) of a Hopf algebra H is a Hopf subalgebra
if it is a subcoalgebra of H and the antipode S maps K into K. In other words, a Hopf subalgebra K is a Hopf algebra
in its own right when the multiplication, comultiplication, counit and antipode of H are restricted to K (and
additionally the identity 1 is required to be in K). The Nichols-Zoeller freeness theorem established (in 1989) that
either natural K-module H is free of finite rank if H is finite dimensional: a generalization of Lagrange's theorem for
subgroups. As a corollary of this and integral theory, a Hopf subalgebra of a semisimple finite dimensional Hopf
algebra is automatically semisimple.
A Hopf subalgebra K is said to be right normal in a Hopf algebra H if it satisfies the condition of stability,
$ad_r(h)(K) \subseteq K$ for all h in H, where the right adjoint mapping $ad_r$ is defined by
$ad_r(h)(k) = S(h_{(1)}) k h_{(2)}$ for all k in K, h in H. Similarly, a Hopf subalgebra K is left normal in H if it is stable
under the left adjoint mapping defined by $ad_l(h)(k) = h_{(1)} k S(h_{(2)})$. The two conditions of normality are
equivalent if the antipode S is bijective, in which case K is said to be a normal Hopf subalgebra.
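A short worked case may help, using the group algebra as an assumed example: take H = KG for a group G and the Hopf subalgebra KN spanned by a subgroup N, with Δ(g) = g ⊗ g and S(g) = g^{-1} on group elements.

```latex
\[
\mathrm{ad}_r(g)(n) = S(g)\, n\, g = g^{-1} n g,
\qquad
\mathrm{ad}_l(g)(n) = g\, n\, S(g) = g\, n\, g^{-1}
\qquad (g \in G,\ n \in N).
\]
```

So in this case KN is stable under either adjoint mapping exactly when N is a normal subgroup of G, which is the sense in which the definition generalizes normality of subgroups.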
A normal Hopf subalgebra K in H satisfies the condition (of equality of subsets of H): $HK^+ = K^+H$, where $K^+$
denotes the kernel of the counit on K. This normality condition implies that $HK^+$ is a Hopf ideal of H (i.e. an
algebra ideal in the kernel of the counit, a coalgebra coideal and stable under the antipode). As a consequence one
has a quotient Hopf algebra $H/HK^+$ and epimorphism $H \to H/K^+H$, a theory analogous to that of normal
subgroups and quotient groups in group theory.
Examples
Group algebra. Suppose G is a group. The group algebra KG is a unital associative algebra over K. It turns into a
Hopf algebra if we define
• $\Delta : KG \to KG \otimes KG$ by $\Delta(g) = g \otimes g$ for all g in G
• $\varepsilon : KG \to K$ by $\varepsilon(g) = 1$ for all g in G
• $S : KG \to KG$ by $S(g) = g^{-1}$ for all g in G.
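The three maps above can be checked against the antipode condition on a small example. The sketch below is an illustration under stated assumptions: it takes G to be the permutations of three points, represents elements of KG as dictionaries from permutations to integer coefficients, and uses hypothetical function names.

```python
def compose(p, q):
    """Composition of permutations given as tuples: (p o q)(i) = p[q[i]]."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

IDENTITY = (0, 1, 2)

def coproduct(x):
    """Delta(g) = g (x) g, extended linearly; tensors are keyed by pairs."""
    return {(g, g): coeff for g, coeff in x.items()}

def counit(x):
    """epsilon(g) = 1, extended linearly."""
    return sum(x.values())

def antipode_condition_lhs(x):
    """Apply multiplication o (S (x) id) o Delta to x, with S(g) = g^{-1}."""
    result = {}
    for (g, h), coeff in coproduct(x).items():
        product = compose(inverse(g), h)
        result[product] = result.get(product, 0) + coeff
    return result

# An arbitrary element 2*(0 1) - 5*(0 1 2) + 7*e of the group algebra.
x = {(1, 0, 2): 2, (1, 2, 0): -5, (0, 1, 2): 7}
lhs = antipode_condition_lhs(x)
rhs = {IDENTITY: counit(x)}     # epsilon(x) times the unit of KG
print(lhs, rhs)
```

Every term g^{-1} g collapses to the identity, so the left-hand side is the sum of the coefficients times the unit, which is ε(x)·1.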
Functions on a finite group. Suppose now that G is a finite group. Then the set $K^G$ of all functions from G to K with
pointwise addition and multiplication is a unital associative algebra over K, and $K^G \otimes K^G$ is naturally isomorphic to
$K^{G \times G}$ (for G infinite, $K^G \otimes K^G$ is a proper subset of $K^{G \times G}$). The set $K^G$ becomes a Hopf algebra if we define
• $\Delta : K^G \to K^{G \times G}$ by $\Delta(f)(x, y) = f(xy)$ for all f in $K^G$ and all x, y in G
• $\varepsilon : K^G \to K$ by $\varepsilon(f) = f(e)$ for every f in $K^G$ [here e is the identity element of G]
• $S : K^G \to K^G$ by $S(f)(x) = f(x^{-1})$ for all f in $K^G$ and all x in G.
Note that functions on a finite group can be identified with the group ring, though these are more naturally thought of
as dual - the group ring consists of finite sums of elements, and thus pairs with functions on the group by evaluating
the function on the summed elements.
Regular functions on an algebraic group. Generalizing the previous example, we can use the same formulas to
show that for a given algebraic group G over K, the set of all regular functions on G forms a Hopf algebra.
Universal enveloping algebra. Suppose g is a Lie algebra over the field K and U is its universal enveloping algebra.
U becomes a Hopf algebra if we define
• $\Delta : U \to U \otimes U$ by $\Delta(x) = x \otimes 1 + 1 \otimes x$ for every x in g (this rule is compatible with commutators and can
therefore be uniquely extended to all of U).
• $\varepsilon : U \to K$ by $\varepsilon(x) = 0$ for all x in g (again, extended to U)
• $S : U \to U$ by $S(x) = -x$ for all x in g.
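The compatibility of the coproduct with commutators can be verified in two lines. The computation below takes x and y in g, multiplies in U ⊗ U componentwise, and uses the relation xy − yx = [x, y] that holds in U.

```latex
\begin{align*}
\Delta(x)\Delta(y) &= (x \otimes 1 + 1 \otimes x)(y \otimes 1 + 1 \otimes y)
  = xy \otimes 1 + x \otimes y + y \otimes x + 1 \otimes xy, \\
\Delta(y)\Delta(x) &= yx \otimes 1 + y \otimes x + x \otimes y + 1 \otimes yx, \\
[\Delta(x), \Delta(y)] &= (xy - yx) \otimes 1 + 1 \otimes (xy - yx)
  = \Delta([x, y]).
\end{align*}
```

The mixed terms cancel, which is what allows the rule on g to extend to an algebra homomorphism on all of U.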
Cohomology of Lie groups
The cohomology algebra of a Lie group is a Hopf algebra: the multiplication is provided by the cup-product, and the
comultiplication
$H^*(G) \to H^*(G \times G) \cong H^*(G) \otimes H^*(G)$
is induced by the group multiplication G × G → G. This observation was actually a source of the notion of Hopf algebra.
Using this structure, Hopf proved a structure theorem for the cohomology algebra of Lie groups.
Theorem (Hopf). Let A be a finite-dimensional, graded commutative, graded cocommutative Hopf algebra over a
field of characteristic 0. Then A (as an algebra) is a free exterior algebra with generators of odd degree.
Quantum groups and non-commutative geometry
All examples above are either commutative (i.e. the multiplication is commutative) or co-commutative (i.e. $\Delta = T \circ \Delta$,
where $T : H \otimes H \to H \otimes H$ is defined by $T(x \otimes y) = y \otimes x$). Other interesting Hopf algebras are certain
"deformations" or "quantizations" of those from example 3 (regular functions on an algebraic group) which are neither commutative nor co-commutative.
These Hopf algebras are often called quantum groups, a term that is so far only loosely defined. They are important
in noncommutative geometry, the idea being the following: a standard algebraic group is well described by its
standard Hopf algebra of regular functions; we can then think of the deformed version of this Hopf algebra as
describing a certain "non-standard" or "quantized" algebraic group (which is not an algebraic group at all). While
there does not seem to be a direct way to define or manipulate these non-standard objects, one can still work with
their Hopf algebras, and indeed one identifies them with their Hopf algebras. Hence the name "quantum group".
Related concepts
Graded Hopf algebras are often used in algebraic topology: they are the natural algebraic structure on the direct sum
of all homology or cohomology groups of an H-space.
Locally compact quantum groups generalize Hopf algebras and carry a topology. The algebra of all continuous
functions on a Lie group is a locally compact quantum group.
Quasi-Hopf algebras are generalizations of Hopf algebras, where coassociativity only holds up to a twist.
Weak Hopf algebras, or quantum groupoids, are generalizations of Hopf algebras. Like Hopf algebras, weak Hopf
algebras form a self-dual class of algebras; i.e., if H is a (weak) Hopf algebra, so is $H^*$, the dual space of linear
forms on H (with respect to the algebra-coalgebra structure obtained from the natural pairing with H and its
coalgebra-algebra structure).
A weak Hopf algebra H is usually defined by the following requirements:
1) H is a finite-dimensional algebra and coalgebra with coproduct $\Delta: H \to H \otimes H$ and counit
$\varepsilon: H \to k$ satisfying all the axioms of a Hopf algebra except possibly $\Delta(1) \neq 1 \otimes 1$ or
$\varepsilon(ab) \neq \varepsilon(a)\varepsilon(b)$ for some a, b in H. Instead one requires that
$(\Delta(1) \otimes 1)(1 \otimes \Delta(1)) = (1 \otimes \Delta(1))(\Delta(1) \otimes 1) = (\Delta \otimes \mathrm{Id})\Delta(1)$ and
$\varepsilon(abc) = \sum \varepsilon(ab_{(1)})\varepsilon(b_{(2)}c) = \sum \varepsilon(ab_{(2)})\varepsilon(b_{(1)}c)$ for all a, b, and c in H.
2) H has a weakened antipode $S: H \to H$ satisfying the axioms (a) - (c):
(a) $S(a_{(1)})a_{(2)} = 1_{(1)}\varepsilon(a1_{(2)})$ for all a in H (the right-hand side is the interesting projection usually
denoted by $\Pi^R(a)$ or $\varepsilon_s(a)$, with image a separable subalgebra denoted by $H^R$ or $H_s$);
(b) $a_{(1)}S(a_{(2)}) = \varepsilon(1_{(1)}a)1_{(2)}$ for all a in H (another interesting projection usually denoted by
$\Pi^L(a)$ or $\varepsilon_t(a)$, with image a separable algebra $H^L$ or $H_t$, anti-isomorphic to $H^R$ via S); and
(c) $S(a_{(1)})a_{(2)}S(a_{(3)}) = S(a)$ for all a in H.
Note that if $\Delta(1) = 1 \otimes 1$, these conditions reduce to the two usual conditions on the antipode of a Hopf algebra.
The axioms are partly chosen so that the category of H-modules is a rigid tensor category. The unit H-module is the
separable algebra $H^L$ mentioned above.
For example, a finite groupoid algebra is a weak Hopf algebra. In particular, the groupoid algebra on [n] with one
pair of invertible arrows $e_{ij}$ and $e_{ji}$ between i and j in [n] is isomorphic to the algebra H of n x n matrices. The
weak Hopf algebra structure on this particular H is given by coproduct $\Delta(e_{ij}) = e_{ij} \otimes e_{ij}$, counit $\varepsilon(e_{ij}) = 1$
and antipode $S(e_{ij}) = e_{ji}$. The separable subalgebras $H^L$ and $H^R$ coincide and are non-central commutative
algebras in this particular case (the subalgebra of diagonal matrices).
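The matrix example can be checked by hand or by machine. The following is a minimal sketch, assuming the structure maps exactly as stated above and representing elements as dictionaries of coefficients on matrix units; the helper names (`mul`, `mul2`, `coproduct`, and so on) are illustrative choices, not part of any library.

```python
from itertools import product

N = 2  # matrices are N x N; the groupoid has objects 0, ..., N-1

# An element of H is a dict {(i, j): coeff} standing for the sum of coeff * e_ij.
# An element of H (x) H is a dict {((i, j), (k, l)): coeff}.

def mul(x, y):
    """Matrix product on matrix units: e_ij e_kl = e_il if j == k, else 0."""
    out = {}
    for (i, j), c in x.items():
        for (k, l), d in y.items():
            if j == k:
                out[(i, l)] = out.get((i, l), 0) + c * d
    return {key: v for key, v in out.items() if v != 0}

def mul2(s, t):
    """Product in H (x) H, computed factor by factor."""
    out = {}
    for (a, b), c in s.items():
        for (a2, b2), d in t.items():
            for p in mul({a: 1}, {a2: 1}):
                for r in mul({b: 1}, {b2: 1}):
                    out[(p, r)] = out.get((p, r), 0) + c * d
    return {key: v for key, v in out.items() if v != 0}

def coproduct(x):
    """Delta(e_ij) = e_ij (x) e_ij, extended linearly."""
    return {(ij, ij): c for ij, c in x.items()}

def counit(x):
    """epsilon(e_ij) = 1, extended linearly."""
    return sum(x.values())

def antipode(x):
    """S(e_ij) = e_ji, extended linearly."""
    return {(j, i): c for (i, j), c in x.items()}

basis = [{(i, j): 1} for i, j in product(range(N), repeat=2)]
one = {(i, i): 1 for i in range(N)}
one_tensor_one = {((i, i), (j, j)): 1 for i in range(N) for j in range(N)}

# Delta is multiplicative on basis elements (hence everywhere, by linearity) ...
delta_is_multiplicative = all(
    coproduct(mul(x, y)) == mul2(coproduct(x), coproduct(y))
    for x in basis for y in basis
)
# ... but Delta(1) differs from 1 (x) 1, and epsilon is not multiplicative.
delta_preserves_unit = coproduct(one) == one_tensor_one
counit_is_multiplicative = all(
    counit(mul(x, y)) == counit(x) * counit(y) for x in basis for y in basis
)
# Antipode axiom (c): S(a_(1)) a_(2) S(a_(3)) = S(a) on each matrix unit.
antipode_axiom_c = all(
    mul(mul(antipode(x), x), antipode(x)) == antipode(x) for x in basis
)
```

In this sketch the coproduct is multiplicative and the antipode axiom holds, while both the unit condition and multiplicativity of the counit fail, which is exactly the sense in which H is weak rather than an ordinary Hopf algebra.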
Early theoretical contributions to weak Hopf algebras are to be found in [6] as well as [7].
Hopf algebroids were introduced by J.-H. Lu in 1996 as a result of work on groupoids in Poisson geometry (later shown
equivalent, in a nontrivial way, to a construction of Takeuchi from the 1970s and another by Xu around the year 2000).
Hopf algebroids generalize weak Hopf algebras and certain skew Hopf algebras. They may be loosely thought of as
Hopf algebras over a noncommutative base ring, where weak Hopf algebras become Hopf algebras over a separable
algebra. It is a theorem that a Hopf algebroid satisfying a finite projectivity condition over a separable algebra is a
weak Hopf algebra, and conversely a weak Hopf algebra H is a Hopf algebroid over its separable subalgebra $H^L$.
The antipode axioms were changed by G. Bohm and K. Szlachanyi (J. Algebra) in 2004 for tensor categorical
reasons and to accommodate examples associated to depth two Frobenius algebra extensions.
A left Hopf algebroid (H,R) is a left bialgebroid together with an antipode: the bialgebroid (H,R) consists of a total
algebra H and a base algebra R and two mappings, an algebra homomorphism $s: R \to H$ called a source map and an
algebra anti-homomorphism $t: R \to H$ called a target map, such that the commutativity condition
$s(r_1)t(r_2) = t(r_2)s(r_1)$ is satisfied for all $r_1, r_2 \in R$. The axioms resemble those of a Hopf algebra but are
complicated by the possibility that R is a noncommutative algebra or its images under s and t are not in the center of
H. In particular a left bialgebroid (H,R) has an R-R-bimodule structure on H which prefers the left side as follows:
$r_1 \cdot h \cdot r_2 = s(r_1)t(r_2)h$ for all h in H, $r_1, r_2 \in R$. There is a coproduct $\Delta: H \to H \otimes_R H$ and counit
$\varepsilon: H \to R$ that make $(H, R, \Delta, \varepsilon)$ an R-coring (with axioms like those of a coalgebra such that all mappings are
R-R-bimodule homomorphisms and all tensors are over R). Additionally the bialgebroid (H,R) must satisfy
$\Delta(ab) = \Delta(a)\Delta(b)$ for all a, b in H, and a condition to make sure this last condition makes sense: every image
point $\Delta(a)$ satisfies $a_{(1)}t(r) \otimes a_{(2)} = a_{(1)} \otimes a_{(2)}s(r)$ for all r in R. Also $\Delta(1) = 1 \otimes 1$. The counit is
required to satisfy $\varepsilon(1_H) = 1_R$ and the condition $\varepsilon(ab) = \varepsilon(as(\varepsilon(b))) = \varepsilon(at(\varepsilon(b)))$. The antipode
$S: H \to H$ is usually taken to be an algebra anti-automorphism satisfying conditions of
exchanging the source and target maps and satisfying two axioms like Hopf algebra antipode axioms; see the
references in Lu or in Bohm-Szlachanyi for a more example-category friendly, though somewhat more complicated,
set of axioms for the antipode S. The latter set of axioms depends on the axioms of a right bialgebroid as well, which
are a straightforward switching of left to right, s with t, of the axioms for a left bialgebroid given above.
As an example of a left bialgebroid, take R to be any algebra over a field k. Let H be its algebra of linear
self-mappings. Let s(r) be left multiplication by r on R; let t(r) be right multiplication by r on R. H is a left bialgebroid
over R, which may be seen as follows. From the fact that $H \otimes_R H \cong \mathrm{Hom}_k(R \otimes R, R)$ one may define a
coproduct by $\Delta(f)(r \otimes u) = f(ru)$ for each linear transformation f from R to itself and all r, u in R.
Coassociativity of the coproduct follows from associativity of the product on R. A counit is given by $\varepsilon(f) = f(1)$.
The counit axioms of a coring follow from the identity element condition on multiplication in R. The reader will be
amused, or at least edified, to check that (H,R) is a left bialgebroid. In case R is an Azumaya algebra, so that
H is isomorphic to R tensor R, an antipode comes from transposing tensors, which makes H a Hopf algebroid over R.
Multiplier Hopf algebras, introduced by Alfons Van Daele in 1994,[8] are generalizations of Hopf algebras where
the comultiplication maps from an algebra (with or without unit) to the multiplier algebra of the tensor product algebra
of the algebra with itself.
Hopf group-(co)algebras, introduced by V. G. Turaev in 2000, are also generalizations of Hopf algebras.
Analogy with groups
Groups can be axiomatized by the same diagrams (equivalently, operations) as a Hopf algebra, where G is taken to
be a set instead of a module. In this case:
• the field K is replaced by the 1-point set
• there is a natural counit (map to 1 point)
• there is a natural comultiplication (the diagonal map)
• the unit is the identity element of the group
• the multiplication is the multiplication in the group
• the antipode is the inverse
In this philosophy, a group can be thought of as a Hopf algebra over the "field with one element".[9]
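Linearising the dictionary above gives the group algebra of a finite group, which is a Hopf algebra in the ordinary sense. The sketch below takes the symmetric group on three letters as an illustrative choice and checks the antipode axiom, that applying the coproduct, then the antipode on the first factor, then multiplying, yields the counit times the identity. All function names are hypothetical.

```python
from itertools import permutations

G = list(permutations(range(3)))  # the symmetric group S_3, as tuples
identity = tuple(range(3))

def compose(g, h):
    """Group multiplication: (g h)(i) = g(h(i))."""
    return tuple(g[h[i]] for i in range(3))

def inverse(g):
    """Inverse permutation, playing the role of the antipode on group elements."""
    inv = [0] * 3
    for i, gi in enumerate(g):
        inv[gi] = i
    return tuple(inv)

# Elements of the group algebra k[G] are dicts {group element: coefficient}.

def coproduct(x):
    """Delta(g) = g (x) g: the linearised diagonal map."""
    return {(g, g): c for g, c in x.items()}

def counit(x):
    """epsilon(g) = 1: the linearised map to the 1-point set."""
    return sum(x.values())

def antipode_then_multiply(x):
    """Compute m(S (x) id)Delta(x), with S(g) the inverse of g."""
    out = {}
    for (g, h), c in coproduct(x).items():
        k = compose(inverse(g), h)
        out[k] = out.get(k, 0) + c
    return {g: c for g, c in out.items() if c != 0}

x = {G[1]: 2, G[3]: -5, G[4]: 7}   # an arbitrary element of k[G]
lhs = antipode_then_multiply(x)
rhs = {identity: counit(x)}         # epsilon(x) times the unit of k[G]
```

Here `lhs` and `rhs` agree, reflecting that the group axiom "inverse times element equals identity" is the antipode axiom read on basis elements.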
See also
• Quasitriangular Hopf algebra
• Algebra/set analogy
• Representation theory of Hopf algebras
• Ribbon Hopf algebra
• Superalgebra
• Supergroup
• Anyonic Lie algebra
Notes
[1] Dascalescu, Nastasescu & Raianu (2001), Prop. 4.2.6, p. 153 (http://books.google.com/books?id=pBJ6sbPHA0IC&pg=PA153&dq="is+an+antimorphism+of+algebras")
[2] Dascalescu, Nastasescu & Raianu (2001), Remarks 4.2.3, p. 151 (http://books.google.com/books?id=pBJ6sbPHA0IC&pg=PA151&dq="the+antipode+is+unique")
[3] Quantum groups lecture notes (http://www.mathematik.uni-muenchen.de/~pareigis/Vorlesungen/QuantGrp/ln2_l.pdf)
[4] S. Montgomery, Hopf algebras and their actions on rings, Conf. Board in Math. Sci. vol. 82, A.M.S., 1993. ISBN 0-8218-0738-2
[5] Hopf, 1941.
[6] Gabriella Bohm, Florian Nill, Kornel Szlachanyi. J. Algebra 221 (1999), 385-438
[7] Dmitri Nikshych, Leonid Vainerman, in: New direction in Hopf algebras, S. Montgomery and H.-J. Schneider, eds., M.S.R.I. Publications,
vol. 43, Cambridge, 2002, 211-262.
[8] Alfons Van Daele. Multiplier Hopf algebras (http://www.ams.org/tran/1994-342-02/S0002-9947-1994-1220906-5/S0002-9947-1994-1220906-5.pdf), Transactions of the American Mathematical Society 342(2) (1994) 917-932
[9] Group = Hopf algebra « Secret Blogging Seminar (http://sbseminar.wordpress.com/2007/10/07/group-hopf-algebra/), Group objects and
Hopf algebras (http://www.youtube.com/watch?v=p3kkm5dYH-w), video of Simon Willerton.
References
• Dascalescu, Sorin; Nastasescu, Constantin; Raianu, Şerban (2001), Hopf Algebras, Pure and Applied
Mathematics, 235 (1st ed.), Marcel Dekker, ISBN 0-8247-0481-9.
• Pierre Cartier, A primer of Hopf algebras (http://inc.web.ihes.fr/prepub/PREPRINTS/2006/M/M-06-40.pdf),
IHES preprint, September 2006, 81 pages
• Jürgen Fuchs, Affine Lie Algebras and Quantum Groups, (1992), Cambridge University Press. ISBN
0-521-48412-X
• H. Hopf, Über die Topologie der Gruppen-Mannigfaltigkeiten und ihrer Verallgemeinerungen, Ann. of Math. 42
(1941), 22-52. Reprinted in Selecta Heinz Hopf, pp. 119-151, Springer, Berlin (1964). MR4784
• Street, Ross (2007), Quantum groups, Australian Mathematical Society Lecture Series, 19, Cambridge University
Press, MR2294803, ISBN 978-0-521-69524-4.
Quantum group
In mathematics and theoretical physics, the term quantum group denotes various kinds of noncommutative algebra
with additional structure. In general, a quantum group is some kind of Hopf algebra. There is no single,
all-encompassing definition, but instead a family of broadly similar objects.
The term "quantum group" often denotes a kind of noncommutative algebra with additional structure that first
appeared in the theory of quantum integrable systems, and which was then formalized by Vladimir Drinfel'd and
Michio Jimbo as a particular class of Hopf algebra. The same term is also used for other Hopf algebras that deform
or are close to classical Lie groups or Lie algebras, such as a 'bicrossproduct' class of quantum groups introduced by
Shahn Majid a little after the work of Drinfeld and Jimbo.
In Drinfeld's approach, quantum groups arise as Hopf algebras depending on an auxiliary parameter q or h, which
become universal enveloping algebras of a certain Lie algebra, frequently semisimple or affine, when q = 1 or h = 0.
Closely related are certain dual objects, also Hopf algebras and also called quantum groups, deforming the algebra of
functions on the corresponding semisimple algebraic group or a compact Lie group.
Just as groups often appear as symmetries, quantum groups act on many other mathematical objects and it has
become fashionable to introduce the adjective quantum in such cases; for example there are quantum planes and
quantum Grassmannians.
Intuitive meaning
The discovery of quantum groups was quite unexpected, since it was known for a long time that compact groups and
semisimple Lie algebras are "rigid" objects, in other words, they cannot be "deformed". One of the ideas behind
quantum groups is that if we consider a structure that is in a sense equivalent but larger, namely a group algebra or a
universal enveloping algebra, then a group or enveloping algebra can be "deformed", although the deformation will
no longer remain a group or enveloping algebra. More precisely, deformation can be accomplished within the
category of Hopf algebras that are not required to be either commutative or cocommutative. One can think of the
deformed object as an algebra of functions on a "noncommutative space", in the spirit of the noncommutative
geometry of Alain Connes. This intuition, however, came after particular classes of quantum groups had already
proved their usefulness in the study of the quantum Yang-Baxter equation and quantum inverse scattering method
developed by the Leningrad School (Ludwig Faddeev, Leon Takhtajan, Evgenii Sklyanin, Nicolai Reshetikhin and
Korepin) and related work by the Japanese School.[1] The intuition behind the second, bicrossproduct, class of
quantum groups was different and came from the search for self-dual objects as an approach to quantum gravity.[2]
Drinfel'd-Jimbo type quantum groups
One type of object commonly called a "quantum group" appeared in the work of Vladimir Drinfel'd and Michio
Jimbo as a deformation of the universal enveloping algebra of a semisimple Lie algebra or, more generally, a
Kac-Moody algebra, in the category of Hopf algebras. The resulting algebra has additional structure, making it into a
quasitriangular Hopf algebra.
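Let $A = (a_{ij})$ be the Cartan matrix of the Kac-Moody algebra, and let q be a nonzero complex number distinct
from 1. Then the quantum group $U_q(G)$, where G is the Lie algebra whose Cartan matrix is A, is defined as the
unital associative algebra with generators $k_\lambda$ (where $\lambda$ is an element of the weight lattice, i.e.
$2(\lambda, \alpha_i)/(\alpha_i, \alpha_i) \in \mathbb{Z}$ for all i), and $e_i$ and $f_i$ (for simple roots $\alpha_i$), subject to the following relations:
• $k_0 = 1$,
• $k_\lambda k_\mu = k_{\lambda+\mu}$,
• $k_\lambda e_i k_\lambda^{-1} = q^{(\lambda,\alpha_i)} e_i$,
• $k_\lambda f_i k_\lambda^{-1} = q^{-(\lambda,\alpha_i)} f_i$,
• $[e_i, f_j] = \delta_{ij} \frac{k_i - k_i^{-1}}{q_i - q_i^{-1}}$,
• $\sum_{n=0}^{1-a_{ij}} (-1)^n \frac{[1-a_{ij}]_{q_i}!}{[1-a_{ij}-n]_{q_i}! \, [n]_{q_i}!} e_i^n e_j e_i^{1-a_{ij}-n} = 0$, for $i \neq j$,
• $\sum_{n=0}^{1-a_{ij}} (-1)^n \frac{[1-a_{ij}]_{q_i}!}{[1-a_{ij}-n]_{q_i}! \, [n]_{q_i}!} f_i^n f_j f_i^{1-a_{ij}-n} = 0$, for $i \neq j$,
where $k_i = k_{\alpha_i}$, $q_i = q^{\frac{1}{2}(\alpha_i,\alpha_i)}$, $[0]_{q_i}! = 1$, $[n]_{q_i}! = \prod_{m=1}^{n} [m]_{q_i}$ for all positive integers n, and
$[m]_{q_i} = \frac{q_i^m - q_i^{-m}}{q_i - q_i^{-1}}$. These are the q-factorial and q-number, respectively, the q-analogs of the ordinary
factorial and the ordinary integer.
The last two relations above are the q-Serre relations, the deformations of the Serre relations.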
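The q-number and q-factorial entering these relations are easy to tabulate. The following is a minimal sketch using exact rational arithmetic, assuming q is rational and not equal to 0, 1 or -1; the function names are illustrative, and `q_binomial` is simply the ratio of q-factorials that appears as the coefficient in the q-Serre relations.

```python
from fractions import Fraction

def q_number(m, q):
    """[m]_q = (q^m - q^(-m)) / (q - q^(-1)), for q not equal to 0, 1 or -1."""
    q = Fraction(q)
    return (q**m - q**(-m)) / (q - q**(-1))

def q_factorial(n, q):
    """[n]_q! = [1]_q [2]_q ... [n]_q, with [0]_q! = 1."""
    result = Fraction(1)
    for m in range(1, n + 1):
        result *= q_number(m, q)
    return result

def q_binomial(n, k, q):
    """[n]_q! / ([k]_q! [n-k]_q!), the coefficient in the q-Serre relations."""
    return q_factorial(n, q) / (q_factorial(k, q) * q_factorial(n - k, q))

# For a_ij = -1 the q-Serre relation has three terms with coefficients
# 1, -[2]_q, 1, where [2]_q = q + 1/q.
middle_coefficient = q_binomial(2, 1, 2)

# Close to q = 1 the q-number [3]_q approaches the ordinary integer 3.
near_one = q_number(3, Fraction(1000001, 1000000))
```

As q approaches 1 the q-numbers tend to ordinary integers, so the coefficients of the q-Serre relations tend to ordinary binomial coefficients and the classical Serre relations are recovered.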
In the limit as $q \to 1$, these relations approach the relations for the universal enveloping algebra $U(G)$, where
$k_\lambda \to 1$ and $\frac{k_\lambda - k_{-\lambda}}{q - q^{-1}} \to t_\lambda$ as $q \to 1$, where the element $t_\lambda$ of the Cartan subalgebra satisfies
$(t_\lambda, h) = \lambda(h)$ for all h in the Cartan subalgebra.
There are various coassociative coproducts under which these algebras are Hopf algebras, for example,
• $\Delta_1(k_\lambda) = k_\lambda \otimes k_\lambda$, $\Delta_1(e_i) = 1 \otimes e_i + e_i \otimes k_i$, $\Delta_1(f_i) = k_i^{-1} \otimes f_i + f_i \otimes 1$,
• $\Delta_2(k_\lambda) = k_\lambda \otimes k_\lambda$, $\Delta_2(e_i) = k_i^{-1} \otimes e_i + e_i \otimes 1$, $\Delta_2(f_i) = 1 \otimes f_i + f_i \otimes k_i$,
• $\Delta_3(k_\lambda) = k_\lambda \otimes k_\lambda$, $\Delta_3(e_i) = k_i^{-1/2} \otimes e_i + e_i \otimes k_i^{1/2}$, $\Delta_3(f_i) = k_i^{-1/2} \otimes f_i + f_i \otimes k_i^{1/2}$, where the
set of generators has been extended, if required, to include $k_\lambda$ for $\lambda$ which is expressible as the sum of an
element of the weight lattice and half an element of the root lattice.
In addition, any Hopf algebra leads to another with reversed coproduct $T \circ \Delta$, where T is given by
$T(x \otimes y) = y \otimes x$, giving three more possible versions.
The counit on $U_q(A)$ is the same for all these coproducts: $\varepsilon(k_\lambda) = 1$, $\varepsilon(e_i) = 0$, $\varepsilon(f_i) = 0$, and the
respective antipodes for the above coproducts are given by
• $S_1(k_\lambda) = k_{-\lambda}$, $S_1(e_i) = -e_i k_i^{-1}$, $S_1(f_i) = -k_i f_i$,
• $S_2(k_\lambda) = k_{-\lambda}$, $S_2(e_i) = -k_i e_i$, $S_2(f_i) = -f_i k_i^{-1}$,
• $S_3(k_\lambda) = k_{-\lambda}$, $S_3(e_i) = -q_i e_i$, $S_3(f_i) = -q_i^{-1} f_i$.
Alternatively, the quantum group $U_q(G)$ can be regarded as an algebra over the field $\mathbb{C}(q)$, the field of all
rational functions of an indeterminate q over $\mathbb{C}$.
Similarly, the quantum group $U_q(G)$ can be regarded as an algebra over the field $\mathbb{Q}(q)$, the field of all rational
functions of an indeterminate q over $\mathbb{Q}$ (see below in the section on quantum groups at q = 0).
Representation theory
Just as there are many different types of representations for Kac-Moody algebras and their universal enveloping
algebras, so there are many different types of representation for quantum groups.
As is the case for all Hopf algebras, $U_q(G)$ has an adjoint representation on itself as a module, with the action being
given by $\mathrm{Ad}_x \cdot y = \sum_{(x)} x_{(1)} y S(x_{(2)})$, where $\Delta(x) = \sum_{(x)} x_{(1)} \otimes x_{(2)}$.
Case 1: q is not a root of unity
One important type of representation is a weight representation, and the corresponding module is called a weight
module. A weight module is a module with a basis of weight vectors. A weight vector is a nonzero vector v such that
$k_\lambda \cdot v = d_\lambda v$ for all $\lambda$, where $d_\lambda$ are complex numbers for all weights $\lambda$ such that
• $d_0 = 1$,
• $d_\lambda d_\mu = d_{\lambda+\mu}$, for all weights $\lambda$ and $\mu$.
A weight module is called integrable if the actions of $e_i$ and $f_i$ are locally nilpotent (i.e. for any vector v in the
module, there exists a positive integer k, possibly dependent on v, such that $e_i^k \cdot v = f_i^k \cdot v = 0$ for all i). In the case
of integrable modules, the complex numbers $d_\lambda$ associated with a weight vector satisfy $d_\lambda = c_\lambda q^{(\lambda,\nu)}$, where $\nu$
is an element of the weight lattice, and $c_\lambda$ are complex numbers such that
• $c_0 = 1$,
• $c_\lambda c_\mu = c_{\lambda+\mu}$, for all weights $\lambda$ and $\mu$,
• $c_{2\alpha_i} = 1$ for all i.
Of special interest are highest weight representations, and the corresponding highest weight modules. A highest
weight module is a module generated by a weight vector v, subject to $k_\lambda \cdot v = d_\lambda v$ for all weights $\lambda$, and
$e_i \cdot v = 0$ for all i. Similarly, a quantum group can have a lowest weight representation and lowest weight module,
i.e. a module generated by a weight vector v, subject to $k_\lambda \cdot v = d_\lambda v$ for all weights $\lambda$, and $f_i \cdot v = 0$ for all i.
Define a vector v to have weight $\nu$ if $k_\lambda \cdot v = q^{(\lambda,\nu)} v$ for all $\lambda$ in the weight lattice.
If G is a Kac-Moody algebra, then in any irreducible highest weight representation of $U_q(G)$, with highest weight
$\nu$, the multiplicities of the weights are equal to their multiplicities in an irreducible representation of $U(G)$ with
equal highest weight. If the highest weight is dominant and integral (a weight $\mu$ is dominant and integral if $\mu$
satisfies the condition that $2(\mu, \alpha_i)/(\alpha_i, \alpha_i)$ is a non-negative integer for all i), then the weight spectrum of the
irreducible representation is invariant under the Weyl group for G, and the representation is integrable.
Conversely, if a highest weight module is integrable, then its highest weight vector v satisfies $k_\lambda \cdot v = c_\lambda q^{(\lambda,\nu)} v$,
where $c_\lambda$ are complex numbers such that
• $c_0 = 1$,
• $c_\lambda c_\mu = c_{\lambda+\mu}$, for all weights $\lambda$ and $\mu$,
• $c_{2\alpha_i} = 1$ for all i,
and $\nu$ is dominant and integral.
As is the case for all Hopf algebras, the tensor product of two modules is another module. For an element x of
$U_q(G)$, and for vectors v and w in the respective modules, $x \cdot (v \otimes w) = \Delta(x) \cdot (v \otimes w)$, so that
$k_\lambda \cdot (v \otimes w) = k_\lambda \cdot v \otimes k_\lambda \cdot w$, and in the case of coproduct $\Delta_1$, $e_i \cdot (v \otimes w) = v \otimes e_i \cdot w + e_i \cdot v \otimes k_i \cdot w$
and $f_i \cdot (v \otimes w) = k_i^{-1} \cdot v \otimes f_i \cdot w + f_i \cdot v \otimes w$.
The integrable highest weight module described above is a tensor product of a one-dimensional module (on which
$k_\lambda = c_\lambda$ for all $\lambda$, and $e_i = f_i = 0$ for all i) and a highest weight module generated by a nonzero vector $v_0$,
subject to $k_\lambda \cdot v_0 = q^{(\lambda,\nu)} v_0$ for all weights $\lambda$, and $e_i \cdot v_0 = 0$ for all i.
In the specific case where G is a finite-dimensional Lie algebra (as a special case of a Kac-Moody algebra), then the
irreducible representations with dominant integral highest weights are also finite-dimensional.
In the case of a tensor product of highest weight modules, its decomposition into submodules is the same as for the
tensor product of the corresponding modules of the Kac-Moody algebra (the highest weights are the same, as are
their multiplicities).
Quasitriangularity
Case 1: q is not a root of unity
Strictly, the quantum group $U_q(G)$ is not quasitriangular, but it can be thought of as being "nearly quasitriangular"
in that there exists an infinite formal sum which plays the role of an R-matrix. This infinite formal sum is expressible
in terms of generators $e_i$ and $f_i$, and Cartan generators $t_\lambda$, where $k_\lambda$ is formally identified with $q^{t_\lambda}$. The
infinite formal sum is the product of two factors, $q^{\eta \sum_j t_{\lambda_j} \otimes t_{\mu_j}}$, and an infinite formal sum, where $\{\lambda_j\}$ is a basis
for the dual space to the Cartan subalgebra, and $\{\mu_j\}$ is the dual basis, and $\eta$ is a sign (+1 or -1).
The formal infinite sum which plays the part of the R-matrix has a well-defined action on the tensor product of two
irreducible highest weight modules, and also on the tensor product of two lowest weight modules. Specifically, if v
has weight $\alpha$ and w has weight $\beta$, then $q^{\eta \sum_j t_{\lambda_j} \otimes t_{\mu_j}} \cdot (v \otimes w) = q^{\eta(\alpha,\beta)} v \otimes w$, and the fact that the
modules are both highest weight modules or both lowest weight modules reduces the action of the other factor on
$v \otimes w$ to a finite sum.
Specifically, if V is a highest weight module, then the formal infinite sum, R, has a well-defined, and invertible,
action on $V \otimes V$, and this value of R (as an element of $\mathrm{Hom}(V) \otimes \mathrm{Hom}(V)$) satisfies the Yang-Baxter
equation, and therefore allows us to determine a representation of the braid group, and to define quasi-invariants for
knots, links and braids.
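In the smallest case the Yang-Baxter equation can be verified directly. The sketch below assumes one common convention for the 4 x 4 R-matrix attached to the two-dimensional representation of the quantum group of sl(2); this explicit matrix is not given in the text above and other normalisations are in use. It checks $R_{12}R_{13}R_{23} = R_{23}R_{13}R_{12}$ with exact rational arithmetic. The helper names are illustrative.

```python
from fractions import Fraction

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def kron(a, b):
    """Kronecker product, so that kron(a, b) acts as a (x) b."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def r_matrix(q):
    """Assumed convention: R on C^2 (x) C^2 in the basis 00, 01, 10, 11."""
    q = Fraction(q)
    return [[q, 0, 0, 0],
            [0, 1, q - 1 / q, 0],
            [0, 0, 1, 0],
            [0, 0, 0, q]]

I2 = [[1, 0], [0, 1]]
SWAP = [[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]]

def yang_baxter_holds(R):
    """Check R12 R13 R23 == R23 R13 R12 on the triple tensor product."""
    R12 = kron(R, I2)
    R23 = kron(I2, R)
    P23 = kron(I2, SWAP)
    R13 = matmul(P23, matmul(R12, P23))  # conjugate R12 by the swap of factors 2 and 3
    return matmul(R12, matmul(R13, R23)) == matmul(R23, matmul(R13, R12))

# A matrix of the same shape but with a mismatched off-diagonal entry.
not_a_solution = [[2, 0, 0, 0], [0, 1, 1, 0], [0, 0, 1, 0], [0, 0, 0, 2]]
```

Calling `yang_baxter_holds(r_matrix(q))` for a nonzero rational q returns `True`, whereas the mismatched matrix fails, which shows that the off-diagonal entry is tied to the diagonal ones.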
Quantum groups at q = 0
Masaki Kashiwara has researched the limiting behaviour of quantum groups as $q \to 0$.
As a consequence of the defining relations for the quantum group $U_q(G)$, $U_q(G)$ can be regarded as a Hopf
algebra over $\mathbb{Q}(q)$, the field of all rational functions of an indeterminate q over $\mathbb{Q}$.
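For simple root $\alpha_i$ and non-negative integer n, define $e_i^{(n)} = e_i^n/[n]_{q_i}!$ and $f_i^{(n)} = f_i^n/[n]_{q_i}!$ (specifically,
$e_i^{(0)} = f_i^{(0)} = 1$). In an integrable module M, and for weight $\lambda$, a vector $u \in M_\lambda$ (i.e. a vector u in M
with weight $\lambda$) can be uniquely decomposed into the sums
$u = \sum_{n=0}^{\infty} f_i^{(n)} u_n = \sum_{n=0}^{\infty} e_i^{(n)} v_n$,
where $u_n \in \ker(e_i) \cap M_{\lambda+n\alpha_i}$, $v_n \in \ker(f_i) \cap M_{\lambda-n\alpha_i}$, $u_n \neq 0$ only if $n + \frac{2(\lambda,\alpha_i)}{(\alpha_i,\alpha_i)} \geq 0$, and
$v_n \neq 0$ only if $n - \frac{2(\lambda,\alpha_i)}{(\alpha_i,\alpha_i)} \geq 0$. Linear mappings $\tilde{e}_i: M \to M$ and $\tilde{f}_i: M \to M$ can be defined on
$M_\lambda$ by
$\tilde{e}_i u = \sum_{n=1}^{\infty} f_i^{(n-1)} u_n = \sum_{n=0}^{\infty} e_i^{(n+1)} v_n$,
$\tilde{f}_i u = \sum_{n=0}^{\infty} f_i^{(n+1)} u_n = \sum_{n=1}^{\infty} e_i^{(n-1)} v_n$.
Let A be the integral domain of all rational functions in $\mathbb{Q}(q)$ which are regular at q = 0 (i.e. a rational function
f(q) is an element of A if and only if there exist polynomials g(q) and h(q) in the polynomial ring $\mathbb{Q}[q]$ such
that $h(0) \neq 0$, and f(q) = g(q)/h(q)). A crystal base for M is an ordered pair (L, B), such that
• L is a free A-submodule of M such that $M = \mathbb{Q}(q) \otimes_A L$;
• B is a $\mathbb{Q}$-basis of the vector space L/qL over $\mathbb{Q}$,
• $L = \oplus_\lambda L_\lambda$ and $B = \sqcup_\lambda B_\lambda$, where $L_\lambda = L \cap M_\lambda$ and $B_\lambda = B \cap (L_\lambda/qL_\lambda)$,
• $\tilde{e}_i L \subset L$ and $\tilde{f}_i L \subset L$ for all i,
• $\tilde{e}_i B \subset B \cup \{0\}$ and $\tilde{f}_i B \subset B \cup \{0\}$ for all i,
• for all $b \in B$ and $b' \in B$, and for all i, $\tilde{e}_i b = b'$ if and only if $\tilde{f}_i b' = b$.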
To put this into a more informal setting, the actions of $e_i f_i$ and $f_i e_i$ are generally singular at q = 0 on an
integrable module M. The linear mappings $\tilde{e}_i$ and $\tilde{f}_i$ on the module are introduced so that the actions of
$\tilde{e}_i \tilde{f}_i$ and $\tilde{f}_i \tilde{e}_i$ are regular at q = 0 on the module. There exists a $\mathbb{Q}(q)$-basis of weight vectors $\tilde{B}$ for M, with
respect to which the actions of $\tilde{e}_i$ and $\tilde{f}_i$ are regular at q = 0 for all i. The module is then restricted to the free
A-module generated by the basis, and the basis vectors, the A-submodule and the actions of $\tilde{e}_i$ and $\tilde{f}_i$ are evaluated
at q = 0. Furthermore, the basis can be chosen such that at q = 0, for all i, $\tilde{e}_i$ and $\tilde{f}_i$ are represented by
mutual transposes, and map basis vectors to basis vectors or 0.
A crystal base can be represented by a directed graph with labelled edges. Each vertex of the graph represents an
element of the $\mathbb{Q}$-basis B of L/qL, and a directed edge, labelled by i, and directed from vertex $v_1$ to vertex
$v_2$, represents that $b_2 = \tilde{f}_i b_1$ (and, equivalently, that $b_1 = \tilde{e}_i b_2$), where $b_1$ is the basis element represented by
$v_1$, and $b_2$ is the basis element represented by $v_2$. The graph completely determines the actions of $\tilde{e}_i$ and $\tilde{f}_i$ at
q = 0. If an integrable module has a crystal base, then the module is irreducible if and only if the graph
representing the crystal base is connected (a graph is called "connected" if the set of vertices cannot be partitioned
into the union of nontrivial disjoint subsets $V_1$ and $V_2$ such that there are no edges joining any vertex in $V_1$ to any
vertex in $V_2$).
For any integrable module with a crystal base, the weight spectrum for the crystal base is the same as the weight
spectrum for the module, and therefore the weight spectrum for the crystal base is the same as the weight spectrum
for the corresponding module of the appropriate Kac-Moody algebra. The multiplicities of the weights in the crystal
base are also the same as their multiplicities in the corresponding module of the appropriate Kac-Moody algebra.
It is a theorem of Kashiwara that every integrable highest weight module has a crystal base. Similarly, every
integrable lowest weight module has a crystal base.
Tensor products of crystal bases
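Let M be an integrable module with crystal base (L, B) and M' be an integrable module with crystal base
(L', B'). For crystal bases, the coproduct $\Delta$, given by
$\Delta(k_\lambda) = k_\lambda \otimes k_\lambda$, $\Delta(e_i) = e_i \otimes k_i^{-1} + 1 \otimes e_i$, $\Delta(f_i) = f_i \otimes 1 + k_i \otimes f_i$, is adopted. The integrable
module $M \otimes_{\mathbb{Q}(q)} M'$ has crystal base $(L \otimes_A L', B \otimes B')$, where
$B \otimes B' = \{b \otimes_{\mathbb{Q}} b' : b \in B, b' \in B'\}$. For a basis vector $b \in B$, define
$\varepsilon_i(b) = \max\{n \geq 0 : \tilde{e}_i^n b \neq 0\}$ and $\varphi_i(b) = \max\{n \geq 0 : \tilde{f}_i^n b \neq 0\}$. The actions of $\tilde{e}_i$ and $\tilde{f}_i$ on
$b \otimes b'$ are given by
$\tilde{e}_i(b \otimes b') = \begin{cases} \tilde{e}_i b \otimes b', & \text{if } \varphi_i(b) \geq \varepsilon_i(b'), \\ b \otimes \tilde{e}_i b', & \text{if } \varphi_i(b) < \varepsilon_i(b'), \end{cases}$
$\tilde{f}_i(b \otimes b') = \begin{cases} \tilde{f}_i b \otimes b', & \text{if } \varphi_i(b) > \varepsilon_i(b'), \\ b \otimes \tilde{f}_i b', & \text{if } \varphi_i(b) \leq \varepsilon_i(b'). \end{cases}$
The decomposition of the product of two integrable highest weight modules into irreducible submodules is determined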
by the decomposition of the graph of the crystal base into its connected components (i.e. the highest weights of the
submodules are determined, and the multiplicity of each highest weight is determined).
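The tensor product rule can be tried out on the simplest crystals. The sketch below assumes the rank-one case (a single index i, so the subscript is dropped) and models the crystal of the (m+1)-dimensional module as the string 0, 1, ..., m, where k counts how many times the lowering operator has been applied to the highest weight element; it uses Kashiwara's rule that the lowering operator acts on the first factor exactly when $\varphi(b) > \varepsilon(b')$ and the raising operator acts on the first factor exactly when $\varphi(b) \geq \varepsilon(b')$. All names are illustrative.

```python
def f_tilde(k, m):
    """Lowering operator on the string crystal {0, ..., m}; None stands for 0."""
    return k + 1 if k < m else None

def e_tilde(k, m):
    """Raising operator on the string crystal {0, ..., m}; None stands for 0."""
    return k - 1 if k > 0 else None

def eps(k, m):
    """epsilon(b): how many times the raising operator can be applied."""
    return k

def phi(k, m):
    """phi(b): how many times the lowering operator can be applied."""
    return m - k

def f_tensor(b, m1, m2):
    """Lowering operator on b (x) b': first factor iff phi(b) > eps(b')."""
    k1, k2 = b
    if phi(k1, m1) > eps(k2, m2):
        new = f_tilde(k1, m1)
        return None if new is None else (new, k2)
    new = f_tilde(k2, m2)
    return None if new is None else (k1, new)

def e_tensor(b, m1, m2):
    """Raising operator on b (x) b': first factor iff phi(b) >= eps(b')."""
    k1, k2 = b
    if phi(k1, m1) >= eps(k2, m2):
        new = e_tilde(k1, m1)
        return None if new is None else (new, k2)
    new = e_tilde(k2, m2)
    return None if new is None else (k1, new)

def component_sizes(m1, m2):
    """Sizes of the connected components of the tensor product crystal graph."""
    sizes = []
    for k1 in range(m1 + 1):
        for k2 in range(m2 + 1):
            b = (k1, k2)
            if e_tensor(b, m1, m2) is None:   # highest weight element
                length = 0
                while b is not None:           # follow the string downwards
                    length += 1
                    b = f_tensor(b, m1, m2)
                sizes.append(length)
    return sorted(sizes, reverse=True)
```

For example, `component_sizes(1, 1)` gives components of sizes 3 and 1, matching the familiar decomposition of the product of two two-dimensional modules into a three-dimensional and a one-dimensional module.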
Compact matrix quantum groups
See also compact quantum group.
S.L. Woronowicz introduced compact matrix quantum groups. Compact matrix quantum groups are abstract
structures on which the "continuous functions" on the structure are given by elements of a C* -algebra. The geometry
of a compact matrix quantum group is a special case of a noncommutative geometry.
The continuous complex-valued functions on a compact Hausdorff topological space form a commutative
C*-algebra. By the Gelfand theorem, a commutative C*-algebra is isomorphic to the C*-algebra of continuous
complex-valued functions on a compact Hausdorff topological space, and the topological space is uniquely
determined by the C*-algebra up to homeomorphism.
For a compact topological group, G, there exists a C*-algebra homomorphism $\Delta: C(G) \to C(G) \otimes C(G)$
(where $C(G) \otimes C(G)$ is the C*-algebra tensor product - the completion of the algebraic tensor product of C(G)
and C(G)), such that $\Delta(f)(x, y) = f(xy)$ for all $f \in C(G)$, and for all $x, y \in G$ (where
$(f \otimes g)(x, y) = f(x)g(y)$ for all $f, g \in C(G)$ and all $x, y \in G$). There also exists a linear multiplicative
mapping $\kappa: C(G) \to C(G)$, such that $\kappa(f)(x) = f(x^{-1})$ for all $f \in C(G)$ and all $x \in G$. Strictly, this
does not make C(G) a Hopf algebra, unless G is finite. On the other hand, a finite-dimensional representation of G
can be used to generate a *-subalgebra of C(G) which is also a Hopf *-algebra. Specifically, if $g \mapsto (u_{ij}(g))_{i,j}$
is an n-dimensional representation of G, then $u_{ij} \in C(G)$ for all i, j, and $\Delta(u_{ij}) = \sum_k u_{ik} \otimes u_{kj}$ for all
i, j. It follows that the *-algebra generated by $u_{ij}$ for all i, j and $\kappa(u_{ij})$ for all i, j is a Hopf *-algebra: the
counit is determined by $\varepsilon(u_{ij}) = \delta_{ij}$ for all i, j (where $\delta_{ij}$ is the Kronecker delta), the antipode is $\kappa$, and the
unit is given by $1 = \sum_k u_{1k}\kappa(u_{k1}) = \sum_k \kappa(u_{1k})u_{k1}$.
As a generalization, a compact matrix quantum group is defined as a pair (C, u), where C is a C*-algebra and
$u = (u_{ij})_{i,j=1,\dots,n}$ is a matrix with entries in C such that
• The *-subalgebra, $C_0$, of C, which is generated by the matrix elements of u, is dense in C;
• There exists a C*-algebra homomorphism $\Delta: C \to C \otimes C$ (where $C \otimes C$ is the C*-algebra tensor
product - the completion of the algebraic tensor product of C and C) such that
$\Delta(u_{ij}) = \sum_k u_{ik} \otimes u_{kj}$
for all i, j ($\Delta$ is called the comultiplication);
• There exists a linear antimultiplicative map $\kappa: C_0 \to C_0$ (the coinverse) such that $\kappa(\kappa(v^*)^*) = v$ for all
$v \in C_0$ and $\sum_k \kappa(u_{ik})u_{kj} = \sum_k u_{ik}\kappa(u_{kj}) = \delta_{ij}I$, where I is the identity element of C. Since $\kappa$ is
antimultiplicative, $\kappa(vw) = \kappa(w)\kappa(v)$ for all $v, w \in C_0$.
As a consequence of continuity, the comultiplication on C is coassociative.
In general, C is not a bialgebra, and $C_0$ is a Hopf *-algebra.
Informally, C can be regarded as the *-algebra of continuous complex-valued functions over the compact matrix
quantum group, and u can be regarded as a finite-dimensional representation of the compact matrix quantum group.
A representation of the compact matrix quantum group is given by a corepresentation of the Hopf *-algebra (a
corepresentation of a counital coassociative coalgebra A is a square matrix $v = (v_{ij})_{i,j=1,\dots,n}$ with entries in A
(so $v \in M_n(A)$) such that $\Delta(v_{ij}) = \sum_{k=1}^{n} v_{ik} \otimes v_{kj}$ for all i, j and $\varepsilon(v_{ij}) = \delta_{ij}$ for all i, j). Furthermore,
a representation v is called unitary if the matrix for v is unitary (or equivalently, if $\kappa(v_{ij}) = v_{ji}^*$ for all i, j).
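An example of a compact matrix quantum group is $SU_\mu(2)$, where the parameter $\mu$ is a positive real number. So
$SU_\mu(2) = (C(SU_\mu(2)), u)$, where $C(SU_\mu(2))$ is the C*-algebra generated by $\alpha$ and $\gamma$, subject to
$\gamma\gamma^* = \gamma^*\gamma$, $\alpha\gamma = \mu\gamma\alpha$, $\alpha\gamma^* = \mu\gamma^*\alpha$, $\alpha\alpha^* + \mu\gamma^*\gamma = \alpha^*\alpha + \mu^{-1}\gamma^*\gamma = I$,
and $u = \begin{pmatrix} \alpha & \gamma \\ -\gamma^* & \alpha^* \end{pmatrix}$, so that the comultiplication is determined by $\Delta(\alpha) = \alpha \otimes \alpha - \gamma \otimes \gamma^*$,
$\Delta(\gamma) = \alpha \otimes \gamma + \gamma \otimes \alpha^*$, and the coinverse is determined by $\kappa(\alpha) = \alpha^*$, $\kappa(\gamma) = -\mu^{-1}\gamma$,
$\kappa(\gamma^*) = -\mu\gamma^*$, $\kappa(\alpha^*) = \alpha$. Note that u is a representation, but not a unitary representation. u is equivalent
to the unitary representation $v = \begin{pmatrix} \alpha & \sqrt{\mu}\gamma \\ -\frac{1}{\sqrt{\mu}}\gamma^* & \alpha^* \end{pmatrix}$.
Equivalently, $SU_\mu(2) = (C(SU_\mu(2)), w)$, where $C(SU_\mu(2))$ is the C*-algebra generated by $\alpha$ and $\beta$,
subject to
$\beta\beta^* = \beta^*\beta$, $\alpha\beta = \mu\beta\alpha$, $\alpha\beta^* = \mu\beta^*\alpha$, $\alpha\alpha^* + \mu^2\beta^*\beta = \alpha^*\alpha + \beta^*\beta = I$,
and $w = \begin{pmatrix} \alpha & \mu\beta \\ -\beta^* & \alpha^* \end{pmatrix}$, so that the comultiplication is determined by $\Delta(\alpha) = \alpha \otimes \alpha - \mu\beta \otimes \beta^*$,
$\Delta(\beta) = \alpha \otimes \beta + \beta \otimes \alpha^*$, and the coinverse is determined by $\kappa(\alpha) = \alpha^*$, $\kappa(\beta) = -\mu^{-1}\beta$,
$\kappa(\beta^*) = -\mu\beta^*$, $\kappa(\alpha^*) = \alpha$. Note that w is a unitary representation. The realizations can be identified by
equating $\gamma = \sqrt{\mu}\beta$.
When $\mu = 1$, then $SU_\mu(2)$ is equal to the algebra C(SU(2)) of functions on the concrete compact group
SU(2).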
Bicrossproduct quantum groups
Whereas compact matrix pseudogroups are typically versions of Drinfeld-Jimbo quantum groups in a dual function
algebra formulation, with additional structure, the bicrossproduct ones are a distinct second family of quantum
groups of increasing importance as deformations of solvable rather than semisimple Lie groups. They are associated
to Lie splittings of Lie algebras or local factorisations of Lie groups and can be viewed as the cross product or
Mackey quantisation of one of the factors acting on the other for the algebra and a similar story for the coproduct $\Delta$
with the second factor acting back on the first. The very simplest nontrivial example corresponds to two copies of
$\mathbb{R}$ locally acting on each other and results in a quantum group (given here in an algebraic form) with generators
$p, K, K^{-1}$, say, and with relation and coproduct
$[p, K] = hK(K - 1)$, $\Delta p = p \otimes K + 1 \otimes p$, $\Delta K = K \otimes K$
where h is the deformation parameter. This quantum group was linked to a toy model of Planck scale physics
implementing Born reciprocity when viewed as a deformation of the Heisenberg algebra of quantum mechanics.
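As a consistency check on this example, one can verify that the coproduct respects the commutation relation. The short derivation below is a sketch that assumes the relation reads $[p, K] = hK(K-1)$ and that products in the tensor square are taken factor by factor.

```latex
\begin{align*}
[\Delta p, \Delta K]
  &= [\,p \otimes K + 1 \otimes p,\; K \otimes K\,] \\
  &= [p, K] \otimes K^2 + K \otimes [p, K] \\
  &= hK(K-1) \otimes K^2 + K \otimes hK(K-1) \\
  &= h\,(K^2 \otimes K^2 - K \otimes K) \\
  &= h\,\Delta K\,(\Delta K - 1 \otimes 1).
\end{align*}
```

The last line is the coproduct applied to $hK(K-1)$, so the coproduct extends to an algebra homomorphism, as the definition of a Hopf algebra requires.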
Also, starting with any compact real form of a semisimple Lie algebra $\mathfrak{g}$, its complexification as a real Lie algebra of
twice the dimension splits into $\mathfrak{g}$ and a certain solvable Lie algebra (the Iwasawa decomposition), and this provides
a canonical bicrossproduct quantum group associated to $\mathfrak{g}$. For su(2) one obtains a quantum group deformation of
the Euclidean group E(3) of motions in 3 dimensions.
Notes
[1] Schwiebert, Christian (1994), Generalized quantum inverse scattering, arXiv:hep-th/9412237v3
[2] Majid, Shahn (1988), "Hopf algebras for physics at the Planck scale", Classical and Quantum Gravity 5: 1587-1607,
doi:10.1088/0264-9381/5/12/010
References
• Podles, P.; Muller, E., Introduction to quantum groups, arXiv:q-alg/9704002
• Kassel, Christian (1995), Quantum groups, Graduate Texts in Mathematics, 155, Berlin, New York:
Springer-Verlag, MR1321145, ISBN 978-0-387-94370-1
• Majid, Shahn (2002), A quantum groups primer, London Mathematical Society Lecture Note Series, 292,
Cambridge University Press, MR1904789, ISBN 978-0-521-01041-2
• Street, Ross (2007), Quantum groups, Australian Mathematical Society Lecture Series, 19, Cambridge University
Press, MR2294803, ISBN 978-0-521-69524-4.
• Majid, Shahn (January 2006), "What Is... a Quantum Group?" (http://www.ams.org/notices/200601/what-is.pdf)
(PDF), Notices of the American Mathematical Society 53 (1): 30-31, retrieved 2008-01-16
• Shnider, Steven; Sternberg, Shlomo (1993) Quantum groups. From coalgebras to Drinfel'd algebras. A guided
tour. Graduate Texts in Mathematical Physics, II. International Press, Cambridge, MA.
Affine quantum group
In mathematics, a quantum affine algebra (or affine quantum group) is a Hopf algebra that is a q-deformation of
the universal enveloping algebra of an affine Lie algebra. They were introduced independently by Drinfeld (1985)
and Jimbo (1985) as a special case of their general construction of a quantum group from a Cartan matrix. One of
their principal applications has been to the theory of solvable lattice models in quantum statistical mechanics, where
the Yang-Baxter equation occurs with a spectral parameter. Combinatorial aspects of the representation theory of
quantum affine algebras can be described simply using crystal bases, which correspond to the degenerate case when
the deformation parameter q vanishes and the Hamiltonian of the associated lattice model can be explicitly
diagonalized.
References
• Drinfeld, V. G. (1985), "Hopf algebras and the quantum Yang-Baxter equation", Doklady Akademii Nauk SSSR
283 (5): 1060-1064, MR802128, ISSN 0002-3264
• Drinfeld, V. G. (1987), "A new realization of Yangians and of quantum affine algebras", Doklady Akademii Nauk
SSSR 296 (1): 13-17, MR914215, ISSN 0002-3264
• Frenkel, Igor B.; Reshetikhin, N. Yu. (1992), "Quantum affine algebras and holonomic difference equations" [1],
Communications in Mathematical Physics 146 (1): 1-60, doi:10.1007/BF02099206, MR1163666,
ISSN 0010-3616
• Jimbo, Michio (1985), "A q-difference analogue of U(g) and the Yang-Baxter equation", Letters in Mathematical
Physics. 10 (1): 63-69, doi:10.1007/BF00704588, MR797001, ISSN 0377-9017
• Jimbo, Michio; Miwa, Tetsuji (1995), Algebraic analysis of solvable lattice models, CBMS Regional Conference
Series in Mathematics, 85, Published for the Conference Board of the Mathematical Sciences, Washington, DC,
MR1308712, ISBN 978-0-8218-0320-2
References
[1] http://projecteuclid.org/getRecord?id=euclid.cmp/1104249974
Affine Lie algebra
In mathematics, an affine Lie algebra is an infinite-dimensional Lie algebra that is constructed in a canonical
fashion out of a finite-dimensional simple Lie algebra. It is a Kac-Moody algebra whose generalized Cartan matrix
is positive semi-definite and has corank 1. From a purely mathematical point of view, affine Lie algebras are
interesting because their representation theory, like the representation theory of finite-dimensional semisimple Lie
algebras, is much better understood than that of general Kac-Moody algebras. As observed by Victor Kac, the
character formula for representations of affine Lie algebras implies certain combinatorial identities, the Macdonald
identities.
Affine Lie algebras play an important role in string theory and conformal field theory due to the way they are
constructed: starting from a simple Lie algebra $\mathfrak{g}$, one considers the loop algebra, $L\mathfrak{g}$, formed by the $\mathfrak{g}$-valued
functions on a circle (interpreted as the closed string) with pointwise commutator. The affine Lie algebra $\hat{\mathfrak{g}}$ is
obtained by adding one extra dimension to the loop algebra and modifying a commutator in a non-trivial way, which
physicists call a quantum anomaly. The point of view of string theory helps to understand many deep properties of
affine Lie algebras, such as the fact that the characters of their representations are given by modular forms.
Affine Lie algebras from simple Lie algebras
Construction
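If $\mathfrak{g}$ is a finite dimensional simple Lie algebra, the corresponding affine Lie algebra $\hat{\mathfrak{g}}$ is constructed as a central
extension of the infinite-dimensional Lie algebra $\mathfrak{g} \otimes \mathbb{C}[t, t^{-1}]$, with one-dimensional center $\mathbb{C}c$. As a vector
space,
$$\hat{\mathfrak{g}} = \mathfrak{g} \otimes \mathbb{C}[t, t^{-1}] \oplus \mathbb{C}c,$$
where $\mathbb{C}[t, t^{-1}]$ is the complex vector space of Laurent polynomials in the indeterminate t. The Lie bracket is
defined by the formula
$$[a \otimes t^n + \alpha c,\; b \otimes t^m + \beta c] = [a, b] \otimes t^{n+m} + \langle a | b \rangle\, n\, \delta_{m+n,0}\, c$$
for all $a, b \in \mathfrak{g}$, $\alpha, \beta \in \mathbb{C}$ and $n, m \in \mathbb{Z}$, where [a, b] is the Lie bracket in the Lie algebra $\mathfrak{g}$ and $\langle \cdot | \cdot \rangle$ is the
Cartan-Killing form on $\mathfrak{g}$.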
The affine Lie algebra corresponding to a finite-dimensional semisimple Lie algebra is the direct sum of the affine
Lie algebras corresponding to its simple summands.
Constructing the Dynkin diagrams
The Dynkin diagram of each affine Lie algebra consists of that of the corresponding simple Lie algebra plus an
additional node, which corresponds to the addition of an imaginary root. Of course, such a node cannot be attached
to the Dynkin diagram in just any location, but for each simple Lie algebra there exists a number of possible
attachments equal to the cardinality of the group of outer automorphisms of the Lie algebra. In particular, this group
always contains the identity element, and the corresponding affine Lie algebra is called an untwisted affine Lie
algebra. When the simple algebra admits automorphisms that are not inner automorphisms, one may obtain other
Dynkin diagrams and these correspond to twisted affine Lie algebras.
Classifying the central extensions
The attachment of an extra node to the Dynkin diagram of the corresponding simple Lie algebra corresponds to the
following construction. An affine Lie algebra can always be constructed as a central extension of the loop algebra of
the corresponding simple Lie algebra. If one wishes to begin instead with a semisimple Lie algebra, then one needs
to centrally extend by a number of elements equal to the number of simple components of the semisimple algebra. In
physics, one often considers instead the direct sum of a semisimple algebra and an abelian algebra $\mathbb{C}^n$. In this case
one also needs to add n further central elements for the n abelian generators.
The second integral cohomology of the loop group of the corresponding simple compact Lie group is isomorphic to
the integers. Central extensions of the affine Lie group by a single generator are topologically circle bundles over
this free loop group, which are classified by a two-class known as the first Chern class of the fibration. Therefore the
central extensions of an affine Lie group are classified by a single parameter k which is called the central charge in
the physics literature, where it first appeared. Unitary highest weight representations of the affine compact groups
only exist when k is a natural number. More generally, if one considers a semisimple algebra, there is a central
charge for each simple component.
Applications
Affine Lie algebras appear naturally in theoretical physics (for example, in conformal field theories such as the WZW model and
coset models and even on the worldsheet of the heterotic string), geometry, and elsewhere in mathematics.
References
• Di Francesco, P.; Mathieu, P.; Sénéchal, D. (1997), Conformal Field Theory, Springer-Verlag, ISBN
0-387-94785-X
• Fuchs, Jürgen (1992), Affine Lie Algebras and Quantum Groups, Cambridge University Press, ISBN
0-521-48412-X
• Goddard, Peter; Olive, David (1988), Kac-Moody and Virasoro algebras: A Reprint Volume for Physicists,
Advanced Series in Mathematical Physics, 3, World Scientific, ISBN 9971-50-419-7
• Kac, Victor (1990), Infinite dimensional Lie algebras (3 ed.), Cambridge University Press, ISBN 0-521-46693-8
• Kohno, Toshitake (1998), Conformal Field Theory and Topology, American Mathematical Society, ISBN
0-8218-2130-X
• Pressley, Andrew; Segal, Graeme (1986), Loop groups, Oxford University Press, ISBN 0-19-853535-X
Quantum affine algebra
In mathematics, a quantum affine algebra (or affine quantum group) is a Hopf algebra that is a q-deformation of
the universal enveloping algebra of an affine Lie algebra. They were introduced independently by Drinfeld (1985)
and Jimbo (1985) as a special case of their general construction of a quantum group from a Cartan matrix. One of
their principal applications has been to the theory of solvable lattice models in quantum statistical mechanics, where
the Yang-Baxter equation occurs with a spectral parameter. Combinatorial aspects of the representation theory of
quantum affine algebras can be described simply using crystal bases, which correspond to the degenerate case when
the deformation parameter q vanishes and the Hamiltonian of the associated lattice model can be explicitly
diagonalized.
References
• Drinfeld, V. G. (1985), "Hopf algebras and the quantum Yang-Baxter equation", Doklady Akademii Nauk SSSR
283 (5): 1060-1064, MR802128, ISSN 0002-3264
• Drinfeld, V. G. (1987), "A new realization of Yangians and of quantum affine algebras", Doklady Akademii Nauk
SSSR 296 (1): 13-17, MR914215, ISSN 0002-3264
• Frenkel, Igor B.; Reshetikhin, N. Yu. (1992), "Quantum affine algebras and holonomic difference equations",
Communications in Mathematical Physics 146 (1): 1-60, doi:10.1007/BF02099206, MR1163666,
ISSN 0010-3616
• Jimbo, Michio (1985), "A q-difference analogue of U(g) and the Yang-Baxter equation", Letters in Mathematical
Physics. 10 (1): 63-69, doi:10.1007/BF00704588, MR797001, ISSN 0377-9017
• Jimbo, Michio; Miwa, Tetsuji (1995), Algebraic analysis of solvable lattice models, CBMS Regional Conference
Series in Mathematics, 85, Published for the Conference Board of the Mathematical Sciences, Washington, DC,
MR1308712, ISBN 978-0-8218-0320-2
Operator algebra
In functional analysis, an operator algebra is an algebra of continuous linear operators on a topological vector space
with the multiplication given by the composition of mappings. Although it is usually classified as a branch of
functional analysis, it has direct applications to representation theory, differential geometry, quantum statistical
mechanics and quantum field theory.
Such algebras can be used to study arbitrary sets of operators with little algebraic relation simultaneously. From this
point of view, operator algebras can be regarded as a generalization of the spectral theory of a single operator. In general,
operator algebras are non-commutative rings.
An operator algebra is typically required to be closed in a specified operator topology inside the algebra of all
continuous linear operators. In particular, it is a set of operators with both algebraic and topological closure
properties. In some disciplines such properties are axiomatized, and algebras with certain topological structure become
the subject of the research.
Though algebras of operators are studied in various contexts (for example, algebras of pseudo-differential operators
acting on spaces of distributions), the term operator algebra is usually used in reference to algebras of bounded
operators on a Banach space or, more specifically, to algebras of operators on a separable Hilbert
space, endowed with the operator norm topology.
In the case of operators on a Hilbert space, the adjoint map on operators gives a natural involution, which provides
additional algebraic structure that can be imposed on the algebra. In this context, the best studied examples are
self-adjoint operator algebras, meaning that they are closed under taking adjoints. These include C*-algebras and von
Neumann algebras. C*-algebras can be easily characterized abstractly by a condition relating the norm, involution
and multiplication. Such abstractly defined C*-algebras can be identified with a certain closed subalgebra of the
algebra of the continuous linear operators on a suitable Hilbert space. A similar result holds for von Neumann
algebras.
Commutative self-adjoint operator algebras can be regarded as the algebra of complex-valued continuous functions
on a locally compact space, or that of measurable functions on a standard measurable space. Thus, general operator
algebras are often regarded as noncommutative generalizations of these algebras, or of the structure of the base space
on which the functions are defined. This point of view is elaborated as the philosophy of noncommutative geometry,
which tries to study various non-classical and/or pathological objects by noncommutative operator algebras.
Examples of operator algebras which are not self-adjoint include:
• nest algebras
• many commutative subspace lattice algebras
• many limit algebras
References
• Blackadar, Bruce (2005). Operator Algebras: Theory of C*-Algebras and von Neumann Algebras. Encyclopaedia
of Mathematical Sciences. Springer-Verlag. ISBN 3540284869.
Clifford algebra
In mathematics, Clifford algebras are a type of associative algebra. They can be thought of as one of the possible
generalizations of the complex numbers and quaternions. The theory of Clifford algebras is intimately connected
with the theory of quadratic forms and orthogonal transformations. Clifford algebras have important applications in a
variety of fields including geometry and theoretical physics. They are named after the English geometer William
Kingdon Clifford.
Introduction and basic properties
Specifically, a Clifford algebra is a unital associative algebra which contains and is generated by a vector space V
equipped with a quadratic form Q. The Clifford algebra $C\ell(V,Q)$ is the "freest" algebra generated by V subject to the
condition
$$v^2 = Q(v)1 \quad \text{for all } v \in V.$$
If the characteristic of the ground field K is not 2, then one can rewrite this fundamental identity in the form
$$uv + vu = 2\langle u, v\rangle \quad \text{for all } u, v \in V,$$
where $\langle u, v\rangle = \tfrac{1}{2}\bigl(Q(u+v) - Q(u) - Q(v)\bigr)$ is the symmetric bilinear form associated to Q, via the polarization
identity. The idea of being the "freest" or "most general" algebra subject to this identity can be formally expressed
through the notion of a universal property, as done below.
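The polarization identity can be checked numerically. The sketch below assumes a diagonal quadratic form over the rationals, given by a tuple of diagonal entries; the function names Q and polar and the sample vectors are illustrative choices, not part of the article.

```python
from fractions import Fraction


def Q(v, diag):
    """Diagonal quadratic form: Q(v) = sum of diag[i] * v[i]^2."""
    return sum(d * x * x for d, x in zip(diag, v))


def polar(u, v, diag):
    """Symmetric bilinear form <u, v> = (Q(u+v) - Q(u) - Q(v)) / 2."""
    w = [a + b for a, b in zip(u, v)]
    return Fraction(Q(w, diag) - Q(u, diag) - Q(v, diag), 2)


diag = (1, 1, -1)            # a form of signature (2, 1), chosen for illustration
u, v = (1, 2, 3), (4, 5, 6)
print(polar(u, v, diag))     # agrees with 1*4 + 2*5 - 3*6
print(polar(u, u, diag) == Q(u, diag))
```

Setting u = v recovers Q itself, which is the sense in which the bilinear form and the quadratic form determine each other when 2 is invertible.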
Quadratic forms and Clifford algebras in characteristic 2 form an exceptional case. In particular, if char K = 2 it is
not true that a quadratic form determines a symmetric bilinear form, or that every quadratic form admits an
orthogonal basis. Many of the statements in this article include the condition that the characteristic is not 2, and are
false if this condition is removed.
As quantization of exterior algebra
Clifford algebras are closely related to exterior algebras. In fact, if Q = 0 then the Clifford algebra $C\ell(V,Q)$ is just the
exterior algebra $\Lambda(V)$. For nonzero Q there exists a canonical linear isomorphism between $\Lambda(V)$ and $C\ell(V,Q)$
whenever the ground field K does not have characteristic two. That is, they are naturally isomorphic as vector spaces,
but with different multiplications (in the case of characteristic two, they are still isomorphic as vector spaces, just not
naturally). Clifford multiplication is strictly richer than the exterior product since it makes use of the extra
information provided by Q.
More precisely, Clifford algebras may be thought of as quantizations (cf. quantization (physics), Quantum group) of
the exterior algebra, in the same way that the Weyl algebra is a quantization of the symmetric algebra.
Weyl algebras and Clifford algebras admit a further structure of a *-algebra, and can be unified as even and odd
terms of a superalgebra, as discussed in CCR and CAR algebras.
Universal property and construction
Let V be a vector space over a field K, and let $Q : V \to K$ be a quadratic form on V. In most cases of interest the field
K is either $\mathbb{R}$, $\mathbb{C}$ or a finite field.
A Clifford algebra $C\ell(V,Q)$ is a unital associative algebra over K together with a linear map $i : V \to C\ell(V,Q)$
satisfying $i(v)^2 = Q(v)1$ for all $v \in V$, defined by the following universal property: Given any associative algebra A
over K and any linear map $j : V \to A$ such that
$$j(v)^2 = Q(v)1 \quad \text{for all } v \in V$$
(where 1 denotes the multiplicative identity of A), there is a unique algebra homomorphism $f : C\ell(V,Q) \to A$ such
that the following diagram commutes (i.e. such that $f \circ i = j$):
[Diagram: commutative triangle formed by $i : V \to C\ell(V,Q)$, $f : C\ell(V,Q) \to A$ and $j : V \to A$]
Working with a symmetric bilinear form $\langle \cdot, \cdot \rangle$ instead of Q (in characteristic not 2), the requirement on j is
$$j(v)j(w) + j(w)j(v) = 2\langle v, w\rangle \quad \text{for all } v, w \in V.$$
A Clifford algebra as described above always exists and can be constructed as follows: start with the most general
algebra that contains V, namely the tensor algebra T(V), and then enforce the fundamental identity by taking a
suitable quotient. In our case we want to take the two-sided ideal $I_Q$ in T(V) generated by all elements of the form
$$v \otimes v - Q(v)1 \quad \text{for all } v \in V$$
and define $C\ell(V,Q)$ as the quotient
$$C\ell(V,Q) = T(V)/I_Q.$$
It is then straightforward to show that $C\ell(V,Q)$ contains V and satisfies the above universal property, so that $C\ell$ is
unique up to a unique isomorphism; thus one speaks of "the" Clifford algebra $C\ell(V,Q)$. It also follows from this
construction that i is injective. One usually drops the i and considers V as a linear subspace of $C\ell(V,Q)$.
The universal characterization of the Clifford algebra shows that the construction of $C\ell(V,Q)$ is functorial in nature.
Namely, $C\ell$ can be considered as a functor from the category of vector spaces with quadratic forms (whose
morphisms are linear maps preserving the quadratic form) to the category of associative algebras. The universal
property guarantees that linear maps between vector spaces (preserving the quadratic form) extend uniquely to
algebra homomorphisms between the associated Clifford algebras.
Basis and dimension
If the dimension of V is n and $\{e_1, \ldots, e_n\}$ is a basis of V, then the set
$$\{e_{i_1} e_{i_2} \cdots e_{i_k} \mid 1 \le i_1 < i_2 < \cdots < i_k \le n \text{ and } 0 \le k \le n\}$$
is a basis for $C\ell(V,Q)$. The empty product (k = 0) is defined as the multiplicative identity element. For each value of
k there are n choose k basis elements, so the total dimension of the Clifford algebra is
$$\dim C\ell(V,Q) = \sum_{k=0}^{n} \binom{n}{k} = 2^n.$$
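The count of basis elements can be reproduced by listing the increasing index tuples directly. This is only an illustrative sketch; the function name clifford_basis is an assumption of the example, and each tuple (i_1, ..., i_k) stands for the product of the corresponding basis vectors, with the empty tuple standing for the identity.

```python
from itertools import combinations


def clifford_basis(n):
    """All increasing index tuples (i_1 < ... < i_k), 0 <= k <= n, over 1..n."""
    return [combo
            for k in range(n + 1)
            for combo in combinations(range(1, n + 1), k)]


basis = clifford_basis(3)
print(basis)                 # starts with the empty product ()
print(len(basis) == 2 ** 3)  # n choose k summed over k gives 2^n
```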
Since V comes equipped with a quadratic form, there is a set of privileged bases for V: the orthogonal ones. An
orthogonal basis is one such that
$$\langle e_i, e_j \rangle = 0 \quad \text{for } i \neq j,$$
where $\langle \cdot, \cdot \rangle$ is the symmetric bilinear form associated to Q. The fundamental Clifford identity implies that for an
orthogonal basis
$$e_i e_j = -e_j e_i \quad \text{for } i \neq j.$$
This makes manipulation of orthogonal basis vectors quite simple. Given a product $e_{i_1} e_{i_2} \cdots e_{i_k}$ of distinct
orthogonal basis vectors, one can put them into standard order by including an overall sign corresponding to the
number of flips needed to correctly order them (i.e. the signature of the ordering permutation).
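The reordering rule can be sketched as follows, using the fact that distinct orthogonal basis vectors anticommute, so each adjacent swap contributes a factor of -1. The function name reorder and the use of a bubble sort to count swaps are choices made for this illustration only.

```python
def reorder(indices):
    """Sort a product of distinct orthogonal basis vectors e_{i_1} ... e_{i_k}.

    Returns (sign, sorted_indices), where sign is the signature of the
    sorting permutation.
    """
    idx = list(indices)
    if len(set(idx)) != len(idx):
        raise ValueError("indices must be distinct")
    sign = 1
    # Bubble sort: every adjacent swap of two anticommuting vectors flips the sign.
    for i in range(len(idx)):
        for j in range(len(idx) - 1 - i):
            if idx[j] > idx[j + 1]:
                idx[j], idx[j + 1] = idx[j + 1], idx[j]
                sign = -sign
    return sign, tuple(idx)


print(reorder([2, 1]))     # one swap
print(reorder([3, 1, 2]))  # two swaps
```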
If the characteristic is not 2 then an orthogonal basis for V exists, and one can easily extend the quadratic form on V
to a quadratic form on all of $C\ell(V,Q)$ by requiring that distinct elements $e_{i_1} e_{i_2} \cdots e_{i_k}$ are orthogonal to one another
whenever the $e_i$ are orthogonal. Additionally, one sets
$$Q(e_{i_1} e_{i_2} \cdots e_{i_k}) = Q(e_{i_1}) Q(e_{i_2}) \cdots Q(e_{i_k}).$$
The quadratic form on a scalar is just $Q(\lambda) = \lambda^2$. Thus, orthogonal bases for V extend to orthogonal bases for
$C\ell(V,Q)$. The quadratic form defined in this way is actually independent of the orthogonal basis chosen (a
basis-independent formulation will be given later).
Examples: real and complex Clifford algebras
The most important Clifford algebras are those over real and complex vector spaces equipped with nondegenerate
quadratic forms. The geometric interpretation of nondegenerate Clifford algebras is known as geometric algebra.
Every nondegenerate quadratic form on a finite-dimensional real vector space is equivalent to the standard diagonal
form:
$$Q(v) = v_1^2 + \cdots + v_p^2 - v_{p+1}^2 - \cdots - v_{p+q}^2$$
where n = p + q is the dimension of the vector space. The pair of integers (p, q) is called the signature of the
quadratic form. The real vector space with this quadratic form is often denoted $\mathbb{R}^{p,q}$. The Clifford algebra on $\mathbb{R}^{p,q}$ is
denoted $C\ell_{p,q}(\mathbb{R})$. The symbol $C\ell_n(\mathbb{R})$ means either $C\ell_{n,0}(\mathbb{R})$ or $C\ell_{0,n}(\mathbb{R})$ depending on whether the author prefers
positive definite or negative definite spaces.
A standard orthonormal basis $\{e_i\}$ for $\mathbb{R}^{p,q}$ consists of n = p + q mutually orthogonal vectors, p of which have norm
+1 and q of which have norm -1. The algebra $C\ell_{p,q}(\mathbb{R})$ will therefore have p vectors which square to +1 and q
vectors which square to -1.
Note that $C\ell_{0,0}(\mathbb{R})$ is naturally isomorphic to $\mathbb{R}$ since there are no nonzero vectors. $C\ell_{0,1}(\mathbb{R})$ is a two-dimensional
algebra generated by a single vector which squares to -1, and therefore is isomorphic to $\mathbb{C}$, the field of complex
numbers. The algebra $C\ell_{0,2}(\mathbb{R})$ is a four-dimensional algebra spanned by $\{1, e_1, e_2, e_1 e_2\}$. The latter three elements
square to -1 and all anticommute, and so the algebra is isomorphic to the quaternions $\mathbb{H}$. The next algebra in the
sequence, $C\ell_{0,3}(\mathbb{R})$, is an 8-dimensional algebra isomorphic to the direct sum $\mathbb{H} \oplus \mathbb{H}$, called the split-biquaternions.
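The quaternion relations in the algebra with two generators squaring to -1 can be verified mechanically. The sketch below assumes an orthogonal basis and encodes each basis product as a bitmask (bit i set means that the (i+1)-th basis vector is a factor); the function name blade_product and this encoding are conventions of the example, not of the article.

```python
def blade_product(a, b, squares):
    """Multiply two basis products a and b given as bitmasks.

    Returns (coefficient, bitmask) for an orthogonal basis in which the
    (i+1)-th basis vector squares to squares[i].
    """
    sign = 1
    shifted = a >> 1
    while shifted:
        # Each factor of b has to move past the higher-indexed factors of a;
        # every such move of anticommuting vectors flips the sign.
        if bin(shifted & b).count("1") % 2:
            sign = -sign
        shifted >>= 1
    coeff = sign
    for i, sq in enumerate(squares):
        if ((a & b) >> i) & 1:
            coeff *= sq  # a repeated vector contracts to its square
    return coeff, a ^ b


SQUARES = (-1, -1)                    # two generators, both squaring to -1
ONE, I, J, K = 0b00, 0b01, 0b10, 0b11  # 1, e_1, e_2, e_1 e_2

for name, x in (("i", I), ("j", J), ("k", K)):
    print(name, "squared:", blade_product(x, x, SQUARES))
print("ij:", blade_product(I, J, SQUARES), " ji:", blade_product(J, I, SQUARES))
```

With i = e_1, j = e_2 and k = e_1 e_2 this reproduces the defining relations of the quaternions: each of the three squares to -1, and ij = k = -ji.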
One can also study Clifford algebras on complex vector spaces. Every nondegenerate quadratic form on a complex
vector space is equivalent to the standard diagonal form
$$Q(z) = z_1^2 + z_2^2 + \cdots + z_n^2$$
where n = dim V, so there is essentially only one Clifford algebra in each dimension. We will denote the Clifford
algebra on $\mathbb{C}^n$ with the standard quadratic form by $C\ell_n(\mathbb{C})$. One can show that the algebra $C\ell_n(\mathbb{C})$ may be obtained as
the complexification of the algebra $C\ell_{p,q}(\mathbb{R})$ where n = p + q:
$$C\ell_n(\mathbb{C}) \cong C\ell_{p,q}(\mathbb{R}) \otimes \mathbb{C} \cong C\ell(\mathbb{C}^{p+q}, Q \otimes \mathbb{C}).$$
Here Q is the real quadratic form of signature (p, q). Note that the complexification does not depend on the signature.
The first few cases are not hard to compute. One finds that
$$C\ell_0(\mathbb{C}) \cong \mathbb{C}$$
$$C\ell_1(\mathbb{C}) \cong \mathbb{C} \oplus \mathbb{C}$$
$$C\ell_2(\mathbb{C}) \cong M_2(\mathbb{C})$$
where $M_2(\mathbb{C})$ denotes the algebra of 2x2 matrices over $\mathbb{C}$.
It turns out that every one of the algebras $C\ell_{p,q}(\mathbb{R})$ and $C\ell_n(\mathbb{C})$ is isomorphic to a matrix algebra over $\mathbb{R}$, $\mathbb{C}$, or $\mathbb{H}$ or
to a direct sum of two such algebras. For a complete classification of these algebras see classification of Clifford
algebras.
Properties
Relation to the exterior algebra
Given a vector space V one can construct the exterior algebra $\Lambda(V)$, whose definition is independent of any quadratic
form on V. It turns out that if K does not have characteristic 2 then there is a natural isomorphism between $\Lambda(V)$ and
$C\ell(V,Q)$ considered as vector spaces (and there exists an isomorphism in characteristic two, which may not be
natural). This is an algebra isomorphism if and only if Q = 0. One can thus consider the Clifford algebra $C\ell(V,Q)$ as
an enrichment (or more precisely, a quantization, cf. the Introduction) of the exterior algebra on V with a
multiplication that depends on Q (one can still define the exterior product independent of Q).
The easiest way to establish the isomorphism is to choose an orthogonal basis $\{e_i\}$ for V and extend it to an
orthogonal basis for $C\ell(V,Q)$ as described above. The map $C\ell(V,Q) \to \Lambda(V)$ is determined by
$$e_{i_1} e_{i_2} \cdots e_{i_k} \mapsto e_{i_1} \wedge e_{i_2} \wedge \cdots \wedge e_{i_k}.$$
Note that this only works if the basis $\{e_i\}$ is orthogonal. One can show that this map is independent of the choice of
orthogonal basis and so gives a natural isomorphism.
If the characteristic of K is 0, one can also establish the isomorphism by antisymmetrizing. Define functions
$f_k : V \times \cdots \times V \to C\ell(V,Q)$ by
$$f_k(v_1, \ldots, v_k) = \frac{1}{k!} \sum_{\sigma \in S_k} \operatorname{sgn}(\sigma)\, v_{\sigma(1)} \cdots v_{\sigma(k)}$$
where the sum is taken over the symmetric group on k elements. Since $f_k$ is alternating it induces a unique linear map
$\Lambda^k(V) \to C\ell(V,Q)$. The direct sum of these maps gives a linear map between $\Lambda(V)$ and $C\ell(V,Q)$. This map can be
shown to be a linear isomorphism, and it is natural.
A more sophisticated way to view the relationship is to construct a filtration on $C\ell(V,Q)$. Recall that the tensor
algebra T(V) has a natural filtration: $F^0 \subset F^1 \subset F^2 \subset \cdots$ where $F^k$ contains sums of tensors with rank $\le k$. Projecting
this down to the Clifford algebra gives a filtration on $C\ell(V,Q)$. The associated graded algebra
$$\operatorname{Gr}_F C\ell(V,Q) = \bigoplus_k F^k / F^{k-1}$$
is naturally isomorphic to the exterior algebra $\Lambda(V)$. Since the associated graded algebra of a filtered algebra is
always isomorphic to the filtered algebra as filtered vector spaces (by choosing complements of $F^k$ in $F^{k+1}$ for all k),
this provides an isomorphism (although not a natural one) in any characteristic, even two.
Grading
In the following, assume that the characteristic is not 2.
Clifford algebras are $\mathbb{Z}_2$-graded algebras (also known as superalgebras). Indeed, the linear map on V defined by
$v \mapsto -v$ (reflection through the origin) preserves the quadratic form Q and so by the universal property of Clifford
algebras extends to an algebra automorphism
$$\alpha : C\ell(V,Q) \to C\ell(V,Q).$$
Since $\alpha$ is an involution (i.e. it squares to the identity) one can decompose $C\ell(V,Q)$ into positive and negative
eigenspaces
$$C\ell(V,Q) = C\ell^0(V,Q) \oplus C\ell^1(V,Q)$$
where $C\ell^i(V,Q) = \{x \in C\ell(V,Q) \mid \alpha(x) = (-1)^i x\}$. Since $\alpha$ is an automorphism it follows that
$$C\ell^i(V,Q)\, C\ell^j(V,Q) = C\ell^{i+j}(V,Q)$$
where the superscripts are read modulo 2. This gives $C\ell(V,Q)$ the structure of a $\mathbb{Z}_2$-graded algebra. The subspace
$C\ell^0(V,Q)$ forms a subalgebra of $C\ell(V,Q)$, called the even subalgebra. The subspace $C\ell^1(V,Q)$ is called the odd part
of $C\ell(V,Q)$ (it is not a subalgebra). The $\mathbb{Z}_2$-grading plays an important role in the analysis and application of Clifford
algebras. The automorphism $\alpha$ is called the main involution or grade involution.
Remark. In characteristic not 2 the underlying vector space of $C\ell(V,Q)$ inherits a $\mathbb{Z}$-grading from the canonical
isomorphism with the underlying vector space of the exterior algebra $\Lambda(V)$. It is important to note, however, that this
is a vector space grading only. That is, Clifford multiplication does not respect the $\mathbb{Z}$-grading, only the $\mathbb{Z}_2$-grading:
for instance if $Q(v) \neq 0$, then $v \in C\ell^1(V,Q)$, but $v^2 \in C\ell^0(V,Q)$, not in $C\ell^2(V,Q)$. Happily, the
gradings are related in the natural way: $\mathbb{Z}_2 = \mathbb{Z}/2\mathbb{Z}$. Further, the Clifford algebra is $\mathbb{Z}$-filtered:
$C\ell^{\le i}(V,Q) \cdot C\ell^{\le j}(V,Q) \subset C\ell^{\le i+j}(V,Q)$. The degree of a Clifford number usually refers to the degree in
the $\mathbb{Z}$-grading. Elements which are pure in the $\mathbb{Z}_2$-grading are simply said to be even or odd.
The even subalgebra $C\ell^0(V,Q)$ of a Clifford algebra is itself a Clifford algebra. If V is the orthogonal direct sum of
a vector a of norm Q(a) and a subspace U, then $C\ell^0(V,Q)$ is isomorphic to $C\ell(U, -Q(a)Q)$, where $-Q(a)Q$ is the form
Q restricted to U and multiplied by $-Q(a)$. In particular over the reals this implies that
$$C\ell^0_{p,q}(\mathbb{R}) \cong C\ell_{p,q-1}(\mathbb{R}) \quad \text{for } q > 0, \text{ and}$$
$$C\ell^0_{p,q}(\mathbb{R}) \cong C\ell_{q,p-1}(\mathbb{R}) \quad \text{for } p > 0.$$
In the negative-definite case this gives an inclusion $C\ell_{0,n-1}(\mathbb{R}) \subset C\ell_{0,n}(\mathbb{R})$ which extends the sequence
$$\mathbb{R} \subset \mathbb{C} \subset \mathbb{H} \subset \mathbb{H} \oplus \mathbb{H} \subset \cdots$$
Likewise, in the complex case, one can show that the even subalgebra of $C\ell_n(\mathbb{C})$ is isomorphic to $C\ell_{n-1}(\mathbb{C})$.
Antiautomorphisms
In addition to the automorphism $\alpha$, there are two antiautomorphisms which play an important role in the analysis of
Clifford algebras. Recall that the tensor algebra T(V) comes with an antiautomorphism that reverses the order in all
products:
$$v_1 \otimes v_2 \otimes \cdots \otimes v_k \mapsto v_k \otimes \cdots \otimes v_2 \otimes v_1.$$
Since the ideal $I_Q$ is invariant under this reversal, this operation descends to an antiautomorphism of $C\ell(V,Q)$ called
the transpose or reversal operation, denoted by $x^t$. The transpose is an antiautomorphism: $(xy)^t = y^t x^t$. The
transpose operation makes no use of the $\mathbb{Z}_2$-grading so we define a second antiautomorphism by composing $\alpha$ and
the transpose. We call this operation Clifford conjugation, denoted $\bar{x}$:
$$\bar{x} = \alpha(x^t) = \alpha(x)^t.$$
Of the two antiautomorphisms, the transpose is the more fundamental.
Note that all of these operations are involutions. One can show that they act as ±1 on elements which are pure in the
$\mathbb{Z}$-grading. In fact, all three operations depend only on the degree modulo 4. That is, if x is pure with degree k then
$$\alpha(x) = \pm x \qquad x^t = \pm x \qquad \bar{x} = \pm x$$
where the signs are given by the following table:

k mod 4        0    1    2    3    sign
$\alpha(x)$    +    -    +    -    $(-1)^k$
$x^t$          +    +    -    -    $(-1)^{k(k-1)/2}$
$\bar{x}$      +    -    -    +    $(-1)^{k(k+1)/2}$
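The three sign rules for the grade involution, the transpose and Clifford conjugation can be tabulated directly from the formulas (-1)^k, (-1)^(k(k-1)/2) and (-1)^(k(k+1)/2). The function name signs is an assumption of this sketch.

```python
def signs(k):
    """Signs of (grade involution, transpose, Clifford conjugation) in degree k."""
    return ((-1) ** k,
            (-1) ** (k * (k - 1) // 2),
            (-1) ** (k * (k + 1) // 2))


for k in range(4):
    print(k, signs(k))

# The pattern repeats with period 4 in the degree.
print(all(signs(k) == signs(k + 4) for k in range(12)))
```

In every degree the conjugation sign is the product of the other two, as expected for the composition of the grade involution with the transpose.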
The Clifford scalar product
When the characteristic is not 2 the quadratic form Q on V can be extended to a quadratic form on all of $C\ell(V,Q)$ as
explained earlier (which we also denoted by Q). A basis independent definition is
$$Q(x) = \langle x^t x \rangle$$
where $\langle a \rangle$ denotes the scalar part of a (the grade 0 part in the $\mathbb{Z}$-grading). One can show that
$$Q(v_1 v_2 \cdots v_k) = Q(v_1) Q(v_2) \cdots Q(v_k)$$
where the $v_i$ are elements of V; this identity is not true for arbitrary elements of $C\ell(V,Q)$.
The associated symmetric bilinear form on $C\ell(V,Q)$ is given by
$$\langle x, y \rangle = \langle x^t y \rangle.$$
One can check that this reduces to the original bilinear form when restricted to V. The bilinear form on all of
$C\ell(V,Q)$ is nondegenerate if and only if it is nondegenerate on V.
It is not hard to verify that the transpose is the adjoint of left/right Clifford multiplication with respect to this inner
product. That is,
$$\langle ax, y \rangle = \langle x, a^t y \rangle, \text{ and}$$
$$\langle xa, y \rangle = \langle x, y a^t \rangle.$$
Structure of Clifford algebras
In this section we assume that the vector space V is finite dimensional and that the bilinear form of Q is non-singular.
A central simple algebra over K is a matrix algebra over a (finite dimensional) division algebra with center K. For
example, the central simple algebras over the reals are matrix algebras over either the reals or the quaternions.
• If V has even dimension then $C\ell(V,Q)$ is a central simple algebra over K.
• If V has even dimension then $C\ell^0(V,Q)$ is a central simple algebra over a quadratic extension of K or a sum of two
isomorphic central simple algebras over K.
• If V has odd dimension then $C\ell(V,Q)$ is a central simple algebra over a quadratic extension of K or a sum of two
isomorphic central simple algebras over K.
• If V has odd dimension then $C\ell^0(V,Q)$ is a central simple algebra over K.
The structure of Clifford algebras can be worked out explicitly using the following result. Suppose that U has even
dimension and a non-singular bilinear form with discriminant d, and suppose that V is another vector space with a
quadratic form. The Clifford algebra of U+V is isomorphic to the tensor product of the Clifford algebras of U and
$(-1)^{\dim(U)/2} d\,V$, which is the space V with its quadratic form multiplied by $(-1)^{\dim(U)/2} d$. Over the reals, this implies
in particular that
$$C\ell_{p+2,q}(\mathbb{R}) = M_2(\mathbb{R}) \otimes C\ell_{q,p}(\mathbb{R})$$
$$C\ell_{p+1,q+1}(\mathbb{R}) = M_2(\mathbb{R}) \otimes C\ell_{p,q}(\mathbb{R})$$
$$C\ell_{p,q+2}(\mathbb{R}) = \mathbb{H} \otimes C\ell_{q,p}(\mathbb{R}).$$
These formulas can be used to find the structure of all real Clifford algebras; see the classification of Clifford
algebras.
Notably, the Morita equivalence class of a Clifford algebra (its representation theory: the equivalence class of the
category of modules over it) depends only on the signature p - q mod 8. This is an algebraic form of Bott
periodicity.
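The three recursion formulas over the reals can be turned into a short procedure that computes the structure of each real Clifford algebra. The sketch below makes several assumptions that are not stated in the text above: an algebra is encoded as a triple (field, n, copies), meaning a direct sum of `copies` matrix algebras of n x n matrices over the field R, C or H; the base cases are the algebras for signatures (0,0), (1,0) and (0,1); and tensoring with the quaternions uses the standard isomorphisms of H tensor C with 2x2 complex matrices and of H tensor H with 4x4 real matrices. The function names are hypothetical.

```python
FIELD_DIM = {"R": 1, "C": 2, "H": 4}  # real dimension of each coefficient field


def tensor_M2R(alg):
    """Tensor with 2x2 real matrices: doubles the matrix size."""
    field, n, copies = alg
    return (field, 2 * n, copies)


def tensor_H(alg):
    """Tensor with the quaternions (assumed standard isomorphisms, see lead-in)."""
    field, n, copies = alg
    if field == "R":
        return ("H", n, copies)
    if field == "C":
        return ("C", 2 * n, copies)
    return ("R", 4 * n, copies)  # H tensor H is a 4x4 real matrix algebra


def clifford_real(p, q):
    """Structure of the real Clifford algebra of signature (p, q)."""
    if (p, q) == (0, 0):
        return ("R", 1, 1)
    if (p, q) == (1, 0):
        return ("R", 1, 2)   # R + R
    if (p, q) == (0, 1):
        return ("C", 1, 1)
    if p >= 1 and q >= 1:
        return tensor_M2R(clifford_real(p - 1, q - 1))  # (p+1, q+1) rule
    if p >= 2:
        return tensor_M2R(clifford_real(q, p - 2))      # (p+2, q) rule
    return tensor_H(clifford_real(q - 2, p))            # (p, q+2) rule


for q in range(9):
    print((0, q), clifford_real(0, q))
```

In the negative-definite column printed here, signature (0, 8) returns to a single real matrix algebra, the same type as signature (0, 0), which illustrates the period-8 behaviour mentioned above.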
The Clifford group Γ
In this section we assume that V is finite dimensional and the quadratic form Q is nondegenerate.
The invertible elements of the Clifford algebra act on it by twisted conjugation: conjugation by x maps
$$y \mapsto x\, y\, \alpha(x)^{-1}.$$
The Clifford group Γ is defined to be the set of invertible elements x that stabilize vectors, meaning that
$$x\, v\, \alpha(x)^{-1} \in V$$
for all v in V.
This formula also defines an action of the Clifford group on the vector space V that preserves the norm Q, and so
gives a homomorphism from the Clifford group to the orthogonal group. The Clifford group contains all elements r
of V of nonzero norm, and these act on V by the corresponding reflections that take v to $v - B(v,r)\,r/Q(r)$, where
$B(v,r) = Q(v+r) - Q(v) - Q(r)$, which equals $2\langle v, r\rangle$ when the characteristic is not 2. (In
characteristic 2 these are called orthogonal transvections rather than reflections.)
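The reflection attached to a vector r of nonzero norm can be checked numerically. The sketch below works directly with vectors rather than with Clifford products, and assumes a diagonal quadratic form over the rationals and the polar form B(v, r) = Q(v + r) - Q(v) - Q(r), so that the reflection is v - B(v, r) r / Q(r). The function names and sample vectors are illustrative.

```python
from fractions import Fraction


def Q(v, diag):
    """Diagonal quadratic form: Q(v) = sum of diag[i] * v[i]^2."""
    return sum(d * x * x for d, x in zip(diag, v))


def B(u, v, diag):
    """Polar form B(u, v) = Q(u + v) - Q(u) - Q(v)."""
    w = [a + b for a, b in zip(u, v)]
    return Q(w, diag) - Q(u, diag) - Q(v, diag)


def reflect(v, r, diag):
    """Reflection of v determined by a vector r with Q(r) != 0."""
    qr = Q(r, diag)
    if qr == 0:
        raise ValueError("r must have nonzero norm")
    c = Fraction(B(v, r, diag), qr)
    return tuple(x - c * y for x, y in zip(v, r))


diag = (1, 1, -1)            # indefinite form, chosen for illustration
r, v = (1, 2, 1), (3, 0, 5)
w = reflect(v, r, diag)
print(w, Q(w, diag) == Q(v, diag))  # the norm Q is preserved
print(reflect(r, r, diag))          # r itself is sent to -r
```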
The Clifford group Γ is the disjoint union of two subsets $\Gamma^0$ and $\Gamma^1$, where $\Gamma^i$ is the subset of elements of degree i.
The subset $\Gamma^0$ is a subgroup of index 2 in Γ.
If V is a finite dimensional real vector space with positive definite (or negative definite) quadratic form then the
Clifford group maps onto the orthogonal group of V with respect to the form (by the Cartan-Dieudonné theorem) and
the kernel consists of the nonzero elements of the field K. This leads to exact sequences
$$1 \to K^* \to \Gamma \to O_V(K) \to 1,$$
$$1 \to K^* \to \Gamma^0 \to SO_V(K) \to 1.$$
Over other fields or with indefinite forms, the map is not in general onto, and the failure is captured by the spinor
norm.
Spinor norm
In arbitrary characteristic, the spinor norm Q is defined on the Clifford group by
$$Q(x) = x^t x.$$
It is a homomorphism from the Clifford group to the group $K^*$ of non-zero elements of K. It coincides with the
quadratic form Q of V when V is identified with a subspace of the Clifford algebra. Several authors define the spinor
norm slightly differently, so that it differs from the one here by a factor of -1, 2, or -2 on $\Gamma^1$. The difference is not
very important in characteristic other than 2.
The nonzero elements of K have spinor norm in the group $K^{*2}$ of squares of nonzero elements of the field K. So
when V is finite dimensional and non-singular we get an induced map from the orthogonal group of V to the group
$K^*/K^{*2}$, also called the spinor norm. The spinor norm of the reflection of a vector r has image Q(r) in $K^*/K^{*2}$, and
this property uniquely defines it on the orthogonal group. This gives exact sequences:
$$1 \to \{\pm 1\} \to \mathrm{Pin}_V(K) \to O_V(K) \to K^*/K^{*2},$$
$$1 \to \{\pm 1\} \to \mathrm{Spin}_V(K) \to SO_V(K) \to K^*/K^{*2}.$$
Note that in characteristic 2 the group {±1} has just one element.
From the point of view of Galois cohomology of algebraic groups, the spinor norm is a connecting homomorphism
on cohomology. Writing μ_2 for the algebraic group of square roots of 1 (over a field of characteristic not 2 it is
roughly the same as a two-element group with trivial Galois action), the short exact sequence
$$1 \to \mu_2 \to \mathrm{Pin}_V \to O_V \to 1$$
yields a long exact sequence on cohomology, which begins
$$1 \to H^0(\mu_2; K) \to H^0(\mathrm{Pin}_V; K) \to H^0(O_V; K) \to H^1(\mu_2; K).$$
The 0th Galois cohomology group of an algebraic group with coefficients in K is just the group of K-valued points:
H^0(G; K) = G(K), and H^1(μ_2; K) = K*/K*^2, which recovers the previous sequence
$$1 \to \{\pm 1\} \to \mathrm{Pin}_V(K) \to O_V(K) \to K^*/K^{*2},$$
where the spinor norm is the connecting homomorphism H^0(O_V; K) → H^1(μ_2; K).
Spin and Pin groups
In this section we assume that V is finite dimensional and its bilinear form is non-singular. (If K has characteristic 2
this implies that the dimension of V is even.)
The Pin group Pin_V(K) is the subgroup of the Clifford group Γ of elements of spinor norm 1, and similarly the Spin
group Spin_V(K) is the subgroup of elements of Dickson invariant 0 in Pin_V(K). When the characteristic is not 2, these
are the elements of determinant 1. The Spin group usually has index 2 in the Pin group.
Recall from the previous section that there is a homomorphism from the Clifford group onto the orthogonal group.
We define the special orthogonal group to be the image of Γ^0. If K does not have characteristic 2 this is just the
group of elements of the orthogonal group of determinant 1. If K does have characteristic 2, then all elements of the
orthogonal group have determinant 1, and the special orthogonal group is the set of elements of Dickson invariant 0.
There is a homomorphism from the Pin group to the orthogonal group. The image consists of the elements of spinor
norm 1 ∈ K*/K*^2. The kernel consists of the elements +1 and -1, and has order 2 unless K has characteristic 2.
Similarly there is a homomorphism from the Spin group to the special orthogonal group of V.
In the common case when V is a positive or negative definite space over the reals, the spin group maps onto the
special orthogonal group, and is simply connected when V has dimension at least 3. Further the kernel of this
homomorphism consists of 1 and -1. So in this case the spin group, Spin(n), is a double cover of SO(n). Note,
however, that the simple connectedness of the spin group is not true in general: if V is R^{p,q} for p and q both at least 2
then the spin group is not simply connected. In this case the algebraic group Spin_{p,q} is simply connected as an
algebraic group, even though its group of real valued points Spin_{p,q}(R) is not simply connected. This is a rather
subtle point, which completely confused the authors of at least one standard book about spin groups.
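The double cover of SO(3) by Spin(3) can be illustrated concretely. The sketch below assumes the standard identification of Spin(3) with the unit quaternions, acting on R^3 by v ↦ q v q^{-1}; this identification is not stated in the text above, and the helper names are hypothetical. It shows that q and -q give the same rotation, which is the kernel {1, -1} of the covering map.

```python
import math

def qmul(a, b):
    """Hamilton product of quaternions given as (w, x, y, z) tuples."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def qconj(q):
    """Conjugate; for a unit quaternion this is also the inverse."""
    return (q[0], -q[1], -q[2], -q[3])

def rotate(q, v):
    """Rotate v in R^3 by the unit quaternion q via v -> q v q^{-1}."""
    p = qmul(qmul(q, (0.0, *v)), qconj(q))
    return p[1:]

def unit_quaternion(axis, angle):
    """Unit quaternion for a rotation by `angle` about `axis`."""
    n = math.sqrt(sum(x * x for x in axis))
    s = math.sin(angle / 2) / n
    return (math.cos(angle / 2), axis[0] * s, axis[1] * s, axis[2] * s)

q = unit_quaternion((0.0, 0.0, 1.0), math.pi / 2)  # quarter turn about z
minus_q = tuple(-x for x in q)
v = (1.0, 0.0, 0.0)
r1 = rotate(q, v)        # approximately (0, 1, 0)
r2 = rotate(minus_q, v)  # same rotation, different group element
```

The two distinct elements q and -q of the covering group map to a single element of SO(3), so the map is two-to-one.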
Spinors
Clifford algebras Cl_{p,q}(C), with p+q=2n even, are matrix algebras which have a complex representation of
dimension 2^n. By restricting to the group Pin_{p,q}(R) we get a complex representation of the Pin group of the same
dimension, called the spin representation. If we restrict this to the spin group Spin_{p,q}(R) then it splits as the sum of
two half spin representations (or Weyl representations) of dimension 2^(n-1).
If p+q=2n+1 is odd then the Clifford algebra Cl_{p,q}(C) is a sum of two matrix algebras, each of which has a
representation of dimension 2^n, and these are also both representations of the Pin group Pin_{p,q}(R). On restriction to
the spin group Spin_{p,q}(R) these become isomorphic, so the spin group has a complex spinor representation of
dimension 2^n.
More generally, spinor groups and pin groups over any field have similar representations whose exact structure
depends on the structure of the corresponding Clifford algebras: whenever a Clifford algebra has a factor that is a
matrix algebra over some division algebra, we get a corresponding representation of the pin and spin groups over
that division algebra. For examples over the reals see the article on spinors.
Real spinors
To describe the real spin representations, one must know how the spin group sits inside its Clifford algebra. The Pin
group, Pin_{p,q}, is the set of invertible elements in Cl_{p,q} which can be written as a product of unit vectors:
$$\mathrm{Pin}_{p,q} = \{ v_1 v_2 \cdots v_r \mid \forall i,\ \|v_i\| = \pm 1 \}.$$
Comparing with the above concrete realizations of the Clifford algebras, the Pin group corresponds to the products
of arbitrarily many reflections: it is a cover of the full orthogonal group O(p,q). The Spin group consists of those
elements of Pin_{p,q} which are products of an even number of unit vectors. Thus by the Cartan-Dieudonné theorem
Spin is a cover of the group of proper rotations SO(p,q).
Let α : Cl → Cl be the automorphism which is given by -Id acting on pure vectors. Then in particular, Spin_{p,q} is the
subgroup of Pin_{p,q} whose elements are fixed by α. Let
$$\mathrm{Cl}^0_{p,q} = \{ x \in \mathrm{Cl}_{p,q} \mid \alpha(x) = x \}.$$
(These are precisely the elements of even degree in Cl_{p,q}.) Then the spin group lies within Cl^0_{p,q}.
The irreducible representations of Cl_{p,q} restrict to give representations of the pin group. Conversely, since the pin
group is generated by unit vectors, all of its irreducible representations are induced in this manner. Thus the two
representations coincide. For the same reasons, the irreducible representations of the spin group coincide with the
irreducible representations of Cl^0_{p,q}.
To classify the pin representations, one need only appeal to the classification of Clifford algebras. To find the spin
representations (which are representations of the even subalgebra), one can first make use of either of the
isomorphisms (see above)
$$\mathrm{Cl}^0_{p,q} \cong \mathrm{Cl}_{p,q-1} \quad \text{for } q > 0,$$
$$\mathrm{Cl}^0_{p,q} \cong \mathrm{Cl}_{q,p-1} \quad \text{for } p > 0,$$
and realize a spin representation in signature (p,q) as a pin representation in either signature (p,q-1) or (q,p-1).
Applications
Differential geometry
One of the principal applications of the exterior algebra is in differential geometry where it is used to define the
bundle of differential forms on a smooth manifold. In the case of a (pseudo-)Riemannian manifold, the tangent
spaces come equipped with a natural quadratic form induced by the metric. Thus, one can define a Clifford bundle in
analogy with the exterior bundle. This has a number of important applications in Riemannian geometry. Perhaps
more important is the link to a spin manifold, its associated spinor bundle and spin^c manifolds.
Physics
Clifford algebras have numerous important applications in physics. Physicists usually consider a Clifford algebra to
be an algebra spanned by matrices γ_1, ..., γ_n called Dirac matrices which have the property that
$$\gamma_i \gamma_j + \gamma_j \gamma_i = 2\eta_{ij}$$
where η is the matrix of a quadratic form of signature (p,q), typically (1,3) when working in Minkowski space.
These are exactly the defining relations for the Clifford algebra Cl_{1,3}(C) (up to an unimportant factor of 2), which by
the classification of Clifford algebras is isomorphic to the algebra of 4 by 4 complex matrices.
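The anticommutation relation can be verified directly for one explicit choice of matrices. The sketch below assumes the Dirac representation built from the Pauli matrices, with indices running from 0 to 3 and η = diag(1, -1, -1, -1); that particular basis and indexing are assumptions for illustration, since the text does not fix a representation.

```python
def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def block(a, b, c, d):
    """Assemble a 4x4 matrix from 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def scale(s, A):
    return [[s * x for x in row] for row in A]

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
sigma = [
    [[0, 1], [1, 0]],      # Pauli sigma_x
    [[0, -1j], [1j, 0]],   # Pauli sigma_y
    [[1, 0], [0, -1]],     # Pauli sigma_z
]

# Assumed Dirac representation: gamma_0 = diag(I, -I), gamma_k = [[0, s_k], [-s_k, 0]].
gamma = [block(I2, Z2, Z2, scale(-1, I2))]
gamma += [block(Z2, s, scale(-1, s), Z2) for s in sigma]
eta = [1, -1, -1, -1]      # diagonal of the metric, signature (1,3)

def anticommutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[i][j] + BA[i][j] for j in range(4)] for i in range(4)]

def clifford_relation_holds(i, j):
    """Check gamma_i gamma_j + gamma_j gamma_i == 2 eta_ij times the identity."""
    expected = 2 * eta[i] if i == j else 0
    ac = anticommutator(gamma[i], gamma[j])
    return all(ac[r][c] == (expected if r == c else 0)
               for r in range(4) for c in range(4))

all_hold = all(clifford_relation_holds(i, j)
               for i in range(4) for j in range(4))
```

All sixteen index pairs satisfy the relation exactly here, because every entry involved is an integer or a multiple of the imaginary unit.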
The Dirac matrices were first written down by Paul Dirac when he was trying to write a relativistic first-order wave
equation for the electron, and give an explicit isomorphism from the Clifford algebra to the algebra of complex
matrices. The result was used to define the Dirac equation and introduce the Dirac operator. The entire Clifford
algebra shows up in quantum field theory in the form of Dirac field bilinears.
Computer Vision
Recently, Clifford algebras have been applied in the problem of action recognition and classification in computer
vision. Rodriguez et al. [5] propose a Clifford embedding to generalize traditional MACH filters to video (3D
spatiotemporal volume), and vector-valued data such as optical flow. Vector-valued data is analyzed using the
Clifford Fourier transform. Based on these vectors action filters are synthesized in the Clifford Fourier domain and
recognition of actions is performed using Clifford Correlation. The authors demonstrate the effectiveness of the
Clifford embedding by recognizing actions typically performed in classic feature films and sports broadcast
television.
Notes
[1] Mathematicians who work with real Clifford algebras and prefer positive definite quadratic forms (especially those working in index theory)
sometimes use a different choice of sign in the fundamental Clifford identity. That is, they take v^2 = -Q(v). One must replace Q with -Q in
going from one convention to the other.
[2] Thus the group algebra K[Z/2] is semisimple and the Clifford algebra splits into eigenspaces of the main involution.
[3] We are still assuming that the characteristic is not 2.
[4] The opposite is true when using the alternate (-) sign convention for Clifford algebras: it is the conjugate which is more important. In general,
the meanings of conjugation and transpose are interchanged when passing from one sign convention to the other. For example, in the
convention used here the inverse of a vector is given by v^{-1} = v^t / Q(v) while in the (-) convention it is given by
v^{-1} = \bar{v} / Q(v).
[5] Rodriguez, Mikel; Shah, M (2008). "Action MACH: A Spatio-Temporal Maximum Average Correlation Height Filter for Action
Classification". Computer Vision and Pattern Recognition (CVPR).
References
• Bourbaki, Nicolas (1988), Algebra, Berlin, New York: Springer-Verlag, ISBN 978-3-540-19373-9, section XI.9.
• Carnahan, S. Borcherds Seminar Notes, Uncut. Week 5, "Spinors and Clifford Algebras".
• Lawson, H. Blaine; Michelsohn, Marie-Louise (1989), Spin Geometry, Princeton, NJ: Princeton University Press,
ISBN 978-0-691-08542-5. An advanced textbook on Clifford algebras and their applications to differential
geometry.
• Lounesto, Pertti (2001), Clifford algebras and spinors, Cambridge: Cambridge University Press,
ISBN 978-0-521-00551-7
• Porteous, Ian R. (1995), Clifford algebras and the classical groups, Cambridge: Cambridge University Press,
ISBN 978-0-521-55177-9
External links
• Planetmath entry on Clifford algebras (http://planetmath.org/encyclopedia/CliffordAlgebra2.html)
• A history of Clifford algebras (http://members.fortunecity.com/jonhays/clifhistory.htm) (unverified)
• John Baez on Clifford algebras (http://www.math.ucr.edu/home/baez/octonions/node6.html)
Distributions
In mathematical analysis, distributions (or generalized functions) are objects that generalize functions.
Distributions make it possible to differentiate functions whose derivatives do not exist in the classical sense. In
particular, any locally integrable function has a distributional derivative. Distributions are widely used to formulate
generalized solutions of partial differential equations. Where a classical solution may not exist or be very difficult to
establish, a distribution solution to a differential equation is often much easier to find. Distributions are also important in
physics and engineering where many problems naturally lead to differential equations whose solutions or initial
conditions are distributions, such as the Dirac delta distribution.
Generalized functions were introduced by Sergei Sobolev in 1935. They were re-introduced in the late 1940s by
Laurent Schwartz, who developed a comprehensive theory of distributions.
Basic idea
[Figure: A typical test function, the bump function. It is smooth (infinitely
differentiable) and has compact support (is zero outside an interval, in this
case the interval [-1, 1]).]
Distributions are a class of linear functionals that map a set of test functions (conventional and well-behaved
functions) into the set of real numbers. In the simplest case, the set of test functions considered is D(R), which is the
set of functions from R to R having two properties:
• The function is smooth (infinitely differentiable);
• The function has compact support (is identically zero outside some interval).
Then, a distribution d is a mapping from D(R) to R. Instead of writing d(φ), where φ is a test function in D(R), it
is conventional to write ⟨d, φ⟩. A simple example of a distribution is the Dirac delta δ, defined by
$$\delta(\varphi) = \langle \delta, \varphi \rangle = \varphi(0).$$
There are straightforward mappings from both locally integrable functions and probability distributions to
corresponding distributions, as discussed below. However, not all distributions can be formed in this manner.
Suppose that
$$f : \mathbf{R} \to \mathbf{R}$$
is a locally integrable function, and let
$$\varphi : \mathbf{R} \to \mathbf{R}$$
be a test function in D(R). We can then define a corresponding distribution T_f by:
$$\langle T_f, \varphi \rangle = \int_{\mathbf{R}} f \varphi \, dx.$$
This integral is a real number which linearly and continuously depends on φ. This suggests the requirement that a
distribution should be linear and continuous over the space of test functions D(R), which completes the definition. In
a conventional abuse of notation, f may be used to represent both the original function f and the distribution T_f
derived from it.
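The idea of a distribution as a functional on test functions can be sketched in a few lines. The code below is an illustration only: it assumes the standard bump function exp(-1/(1 - x^2)) on (-1, 1) as the test function and approximates the integral pairing by a midpoint rule; the names bump, pair_with_function and delta are hypothetical.

```python
import math

def bump(x):
    """Smooth test function supported on [-1, 1]."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def pair_with_function(f, phi, a=-1.0, b=1.0, n=20000):
    """Midpoint-rule approximation of the integral of f * phi over [a, b].

    The default interval is an assumption that phi vanishes outside [-1, 1].
    """
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) * phi(a + (k + 0.5) * h)
                   for k in range(n))

def delta(phi):
    """Dirac delta as a functional: it returns phi(0)."""
    return phi(0.0)

value_at_zero = delta(bump)                          # exp(-1)
odd_pairing = pair_with_function(lambda x: x, bump)  # close to 0 by symmetry
```

Both functionals are linear in the test function; the delta functional, however, is not given by integration against any locally integrable function.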
Similarly, if P is a probability distribution on the reals and φ is a test function, then a corresponding distribution T_P
may be defined by:
$$\langle T_P, \varphi \rangle = \int_{\mathbf{R}} \varphi \, dP.$$
Again, this integral continuously and linearly depends on φ, so that T_P is in fact a distribution.
Such distributions may be multiplied with real numbers and can be added together, so they form a real vector space.
In general it is not possible to define a multiplication for distributions, but distributions may be multiplied with
infinitely differentiable functions.
It is desirable to choose a definition for the derivative of a distribution which, at least for distributions derived from
continuously differentiable functions f, has the property that (T_f)' = T_{f'}. If φ is a test function, we can show that
$$\langle T_{f'}, \varphi \rangle = \int_{\mathbf{R}} f' \varphi \, dx = -\int_{\mathbf{R}} f \varphi' \, dx = -\langle T_f, \varphi' \rangle$$
using integration by parts and noting that [f(x)φ(x)]_{-∞}^{∞} = 0, since φ is zero outside of a bounded set. This
suggests that if S is a distribution, we should define its derivative S' by
$$\langle S', \varphi \rangle = -\langle S, \varphi' \rangle.$$
It turns out that this is the proper definition; it extends the ordinary definition of derivative, every distribution
becomes infinitely differentiable and the usual properties of derivatives hold.
Example: Recall that the Dirac delta (so-called Dirac delta function) is the distribution defined by
$$\langle \delta, \varphi \rangle = \varphi(0).$$
It is the derivative of the distribution corresponding to the Heaviside step function H: For any test function φ,
$$\langle (T_H)', \varphi \rangle = -\int_{-\infty}^{\infty} H(x) \varphi'(x)\, dx = -\int_0^{\infty} \varphi'(x)\, dx = \varphi(0) - \varphi(\infty) = \varphi(0) = \langle \delta, \varphi \rangle,$$
so (T_H)' = δ. Note, φ(∞) = 0 because of compact support. Similarly, the derivative of the Dirac delta is the
distribution
$$\langle \delta', \varphi \rangle = -\varphi'(0).$$
This latter distribution is our first example of a distribution which is derived from neither a function nor a probability
distribution.
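The Heaviside example can be checked numerically. The sketch below assumes a shifted bump function as the test function (so that its value at 0 is not special) and evaluates minus the integral of its derivative over the positive half-line by a midpoint rule; the function names and the shift 0.3 are illustrative choices.

```python
import math

def bump(x):
    """Smooth test function supported on [-1, 1]."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def bump_prime(x):
    """Classical derivative of bump, by the chain rule."""
    if abs(x) >= 1.0:
        return 0.0
    return bump(x) * (-2.0 * x) / (1.0 - x * x) ** 2

def heaviside_derivative(phi_prime, upper, n=20000):
    """Minus the integral of phi' over [0, upper], by the midpoint rule.

    `upper` must lie beyond the support of phi, so the result
    approximates the pairing of the derivative of T_H with phi.
    """
    h = upper / n
    return -h * sum(phi_prime((k + 0.5) * h) for k in range(n))

def phi(x):
    return bump(x - 0.3)        # supported on [-0.7, 1.3]

def phi_prime(x):
    return bump_prime(x - 0.3)

value = heaviside_derivative(phi_prime, upper=1.3)
```

Up to quadrature error, the computed value agrees with phi(0), which is the pairing of the Dirac delta with phi.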
Test functions and distributions
In the sequel, real-valued distributions on an open subset U of R^n will be formally defined. With minor
modifications, one can also define complex-valued distributions, and one can replace R^n by any (paracompact)
smooth manifold.
The first object to define is the space D(U) of test functions on U. Once this is defined, it is then necessary to equip it
with a topology by defining the limit of a sequence of elements of D(U). The space of distributions will then be
given as the space of continuous linear functionals on D(U).
Test function space
The space D(U) of test functions on U is defined as follows. A function φ : U → R is said to have compact support
if there exists a compact subset K of U such that φ(x) = 0 for all x in U \ K. The elements of D(U) are the infinitely
differentiable functions φ : U → R with compact support, also known as bump functions. This is a real vector
space. It can be given a topology by defining the limit of a sequence of elements of D(U). A sequence (φ_k) in
D(U) is said to converge to φ ∈ D(U) if the following two conditions hold (Gelfand & Shilov 1966-1968, v. 1,
§1.2):
• There is a compact set K ⊂ U containing the supports of all φ_k:
$$\bigcup_k \operatorname{supp}(\varphi_k) \subset K.$$
• For each multi-index α, the sequence of partial derivatives D^α φ_k tends uniformly to D^α φ.
With this definition, D(U) becomes a complete locally convex topological vector space satisfying the Heine-Borel
property (Rudin 1991, §6.4-5). If U_i is a countable nested family of open subsets of U with compact closures K_i,
then
$$D(U) = \bigcup_i D_{K_i}$$
where D_{K_i} is the set of all smooth functions with support lying in K_i. The topology on D(U) is the final topology of
the family of nested metric spaces D_{K_i} and so D(U) is an LF-space. The topology is not metrizable by the Baire
category theorem, since D(U) is the union of subspaces of the first category in D(U) (Rudin 1991, §6.9).
Distributions
A distribution on U is a linear functional S : D(U) → R (or S : D(U) → C), such that
$$\lim_{n\to\infty} S(\varphi_n) = S\left(\lim_{n\to\infty} \varphi_n\right)$$
for any convergent sequence φ_n in D(U). The space of all distributions on U is denoted by D'(U). Equivalently, the
vector space D'(U) is the continuous dual space of the topological vector space D(U).
The dual pairing between a distribution S in D'(U) and a test function φ in D(U) is denoted using angle brackets
thus:
$$D'(U) \times D(U) \ni (S, \varphi) \mapsto \langle S, \varphi \rangle \in \mathbf{R}.$$
Equipped with the weak-* topology, the space D'(U) is a locally convex topological vector space. In particular, a
sequence (S_k) in D'(U) converges to a distribution S if and only if
$$\langle S_k, \varphi \rangle \to \langle S, \varphi \rangle$$
for all test functions φ. This is the case if and only if S_k converges uniformly to S on all bounded subsets of D(U).
(A subset E of D(U) is bounded if there exists a compact subset K of U and numbers d_n such that every φ in E has
its support in K and has its n-th derivatives bounded by d_n.)
Functions as distributions
The function f : U → R is called locally integrable if it is Lebesgue integrable over every compact subset K of U.
This is a large class of functions which includes all continuous functions and all L^p functions. The topology on D(U)
is defined in such a fashion that any locally integrable function f yields a continuous linear functional on D(U)
(that is, an element of D'(U)), denoted here by T_f, whose value on the test function φ is given by the Lebesgue
integral:
$$\langle T_f, \varphi \rangle = \int_U f \varphi \, dx.$$
Conventionally, one abuses notation by identifying T_f with f, provided no confusion can arise, and thus the pairing
between f and φ is often written
$$\langle f, \varphi \rangle = \langle T_f, \varphi \rangle.$$
If f and g are two locally integrable functions, then the associated distributions T_f and T_g are equal to the same
element of D'(U) if and only if f and g are equal almost everywhere (see, for instance, Hormander (1983, Theorem
1.2.5)). In a similar manner, every Radon measure μ on U defines an element of D'(U) whose value on the test
function φ is ∫ φ dμ. As above, it is conventional to abuse notation and write the pairing between a Radon
measure μ and a test function φ as ⟨μ, φ⟩. Conversely, essentially by the Riesz representation theorem, every distribution
which is non-negative on non-negative functions is of this form for some (positive) Radon measure.
The test functions are themselves locally integrable, and so define distributions. As such they are dense in D'(U) with
respect to the topology on D'(U) in the sense that for any distribution S ∈ D'(U), there is a sequence φ_n ∈ D(U)
such that
$$\langle \varphi_n, \psi \rangle \to \langle S, \psi \rangle$$
for all ψ ∈ D(U). This follows at once from the Hahn-Banach theorem, since by an elementary fact about weak
topologies the dual of D'(U) with its weak-* topology is the space D(U) (Rudin 1991, Theorem 3.10). This can also
be proven more constructively by a convolution argument.
Operations on distributions
Many operations which are defined on smooth functions with compact support can also be defined for distributions.
In general, if
$$T : D(U) \to D(U)$$
is a linear mapping of vector spaces which is continuous with respect to the weak-* topology, then it is possible to
extend T to a mapping
$$T : D'(U) \to D'(U)$$
by passing to the limit. (This approach works for more general non-linear mappings as well, provided they are
assumed to be uniformly continuous.)
In practice, however, it is more convenient to define operations on distributions by means of the transpose (or adjoint
transformation) (Strichartz 1994, §2.3; Treves 1967). If T : D(U) → D(U) is a continuous linear operator, then the
transpose is an operator T* : D(U) → D(U) such that
$$\langle T\varphi, \psi \rangle = \langle \varphi, T^*\psi \rangle$$
for all φ, ψ ∈ D(U). If such an operator T* exists, and is continuous, then the original operator T may be extended
to distributions by defining
$$\langle TS, \varphi \rangle = \langle S, T^*\varphi \rangle.$$
Differentiation
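If T : D(U) → D(U) is given by the partial derivative
$$T\varphi = \frac{\partial \varphi}{\partial x_k},$$
then by integration by parts, if φ and ψ are in D(U),
$$\int_U \frac{\partial \varphi}{\partial x_k}\, \psi \, dx = -\int_U \varphi\, \frac{\partial \psi}{\partial x_k} \, dx,$$
so that T* = -T. This is a continuous linear transformation D(U) → D(U). So, if S ∈ D'(U) is a distribution, then the
partial derivative of S with respect to the coordinate x_k is defined by the formula
$$\left\langle \frac{\partial S}{\partial x_k}, \varphi \right\rangle = -\left\langle S, \frac{\partial \varphi}{\partial x_k} \right\rangle$$
for all test functions φ. In this way, every distribution is infinitely differentiable, and the derivative in the direction
x_k is a linear operator on D'(U). In general, if α = (α_1, ..., α_n) is an arbitrary multi-index and ∂^α denotes the
associated mixed partial derivative operator, the mixed partial derivative ∂^α S of the distribution S ∈ D'(U) is defined
by
$$\langle \partial^\alpha S, \varphi \rangle = (-1)^{|\alpha|} \langle S, \partial^\alpha \varphi \rangle \quad \text{for all } \varphi \in D(U).$$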
Differentiation of distributions is a continuous operator on D'(U); this is an important and desirable property that is
not shared by most other notions of differentiation.
Multiplication by a smooth function
If m : U → R is an infinitely differentiable function and S is a distribution on U, then the product mS is defined by
(mS)(φ) = S(mφ) for all test functions φ. This definition coincides with the transpose transformation of
$$T_m : \varphi \mapsto m\varphi$$
for φ ∈ D(U). Then, for any test function ψ,
$$\langle T_m \varphi, \psi \rangle = \int_U m(x)\varphi(x)\psi(x)\, dx = \langle \varphi, T_m \psi \rangle$$
so that T_m* = T_m. Multiplication of a distribution S by the smooth function m is therefore defined by
$$mS(\psi) = \langle mS, \psi \rangle = \langle S, m\psi \rangle = S(m\psi).$$
Under multiplication by smooth functions, D'(U) is a module over the ring C^∞(U). With this definition of
multiplication by a smooth function, the ordinary product rule of calculus remains valid. However, a number of
unusual identities also arise. For example, the Dirac delta distribution δ is defined on R by ⟨δ, φ⟩ = φ(0), and
its derivative is given by ⟨δ', φ⟩ = -⟨δ, φ'⟩ = -φ'(0). However, the product mδ' is the distribution
$$m\delta' = m(0)\delta' - m'\delta = m(0)\delta' - m'(0)\delta.$$
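The identity for the product of a smooth function with the derivative of the delta distribution can be tested numerically. This is a sketch under stated assumptions: distributions are modelled as Python callables on test functions, derivatives at 0 are approximated by central differences, and the particular m and φ are arbitrary illustrative choices.

```python
import math

def bump(x):
    """Smooth test function supported on [-1, 1]."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def derivative(g, x, h=1e-5):
    """Central-difference approximation of g'(x)."""
    return (g(x + h) - g(x - h)) / (2.0 * h)

def delta(phi):
    """Pairing of delta with phi: phi(0)."""
    return phi(0.0)

def delta_prime(phi):
    """Pairing of delta' with phi: -phi'(0)."""
    return -derivative(phi, 0.0)

def multiply(m, S):
    """Product mS, defined by (mS)(phi) = S(m * phi)."""
    return lambda phi: S(lambda x: m(x) * phi(x))

def m(x):
    return math.cos(x) + 3.0 * x   # m(0) = 1, m'(0) = 3

def phi(x):
    return bump(x - 0.3)           # test function with phi(0), phi'(0) nonzero

lhs = multiply(m, delta_prime)(phi)
rhs = m(0.0) * delta_prime(phi) - derivative(m, 0.0) * delta(phi)
```

The two sides agree up to the finite-difference error, reflecting the product rule applied to mφ at the origin.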
This definition of multiplication also makes it possible to define the operation of a linear differential operator with
smooth coefficients on a distribution. A linear differential operator P takes a distribution S ∈ D'(U) to another
distribution given by a sum of the form
$$PS = \sum_{|\alpha| \le k} p_\alpha \partial^\alpha S$$
where the coefficients p_α are smooth functions in U. If P is a given differential operator, then the minimum integer k
for which such an expansion holds for every distribution S is called the order of P. The transpose of P is given by
$$\left\langle \sum_{|\alpha| \le k} p_\alpha \partial^\alpha S, \varphi \right\rangle = \left\langle S, \sum_{|\alpha| \le k} (-1)^{|\alpha|} \partial^\alpha (p_\alpha \varphi) \right\rangle.$$
The space D'(U) is a D-module with respect to the action of the ring of linear differential operators.
Composition with a smooth function
Let S be a distribution on an open set U ⊂ R^n. Let V be an open set in R^n, and F : V → U. Then provided F is a
submersion, it is possible to define
$$S \circ F \in D'(V).$$
This is the composition of the distribution S with F, and is also called the pullback of S along F, sometimes written
$$F^\# : S \mapsto F^\# S = S \circ F.$$
The pullback is often denoted F*, but this notation risks confusion with the above use of '*' to denote the transpose of
a linear mapping.
The condition that F be a submersion is equivalent to the requirement that the Jacobian derivative dF(x) of F is a
surjective linear map for every x ∈ V. A necessary (but not sufficient) condition for extending F^# to distributions is
that F be an open mapping (Hormander 1983, Theorem 6.1.1). The inverse function theorem ensures that a
submersion satisfies this condition.
If F is a submersion, then F^# is defined on distributions by finding the transpose map. Uniqueness of this extension is
guaranteed since F^# is a continuous linear operator on D(U). Existence, however, requires using the change of
variables formula, the inverse function theorem (locally) and a partition of unity argument; see Hormander (1983,
Theorem 6.1.2).
In the special case when F is a diffeomorphism from an open subset V of R^n onto an open subset U of R^n, change of
variables under the integral gives
$$\int_V \varphi \circ F(x)\, \psi(x)\, dx = \int_U \varphi(x)\, \psi(F^{-1}(x))\, |\det dF^{-1}(x)|\, dx.$$
In this particular case, then, F^# is defined by the transpose formula:
$$\langle F^\# S, \varphi \rangle = \langle S, |\det d(F^{-1})|\, \varphi \circ F^{-1} \rangle.$$
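The transpose formula for a diffeomorphism can be illustrated in one dimension. The sketch below assumes the affine map F(x) = a x + b on R with a nonzero, for which the inverse is (y - b)/a and the Jacobian factor is 1/|a|; distributions are modelled as callables, and all function names are hypothetical.

```python
import math

def bump(x):
    """Smooth test function supported on [-1, 1]."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def integrate(g, lo=-4.0, hi=4.0, n=40000):
    """Midpoint rule on [lo, hi]; the interval is assumed to contain all supports."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))

def from_function(f):
    """Distribution associated with f: phi is sent to the integral of f * phi."""
    return lambda phi: integrate(lambda x: f(x) * phi(x))

def delta(phi):
    return phi(0.0)

def pullback_affine(S, a, b):
    """Pullback of S along F(x) = a*x + b via the transpose formula."""
    if a == 0:
        raise ValueError("F must be a diffeomorphism, so a must be nonzero")
    return lambda phi: S(lambda y: phi((y - b) / a) / abs(a))

a, b = 2.0, 0.5

def f(y):
    return y * y

direct = integrate(lambda x: f(a * x + b) * bump(x))            # pairing of f o F with bump
via_transpose = pullback_affine(from_function(f), a, b)(bump)   # same pairing via the formula
delta_value = pullback_affine(delta, a, b)(bump)                # bump(-b/a) / |a|
```

For an ordinary function the transpose formula reproduces composition with F, and for the delta distribution it gives the familiar scaling by 1/|a| with evaluation at the preimage of 0.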
Localization of distributions
There is no way to define the value of a distribution in D'(U) at a particular point of U. However, as is the case with
functions, distributions on U restrict to give distributions on open subsets of U. Furthermore, distributions are locally
determined in the sense that a distribution on all of U can be assembled from a distribution on an open cover of U
satisfying some compatibility conditions on the overlap. Such a structure is known as a sheaf.
Restriction
Let U and V be open subsets of R^n with V ⊂ U. Let E_{VU} : D(V) → D(U) be the operator which extends by zero a
given smooth function compactly supported in V to a smooth function compactly supported in the larger set U. Then
the restriction mapping ρ_{VU} is defined to be the transpose of E_{VU}. Thus for any distribution S ∈ D'(U), the restriction
ρ_{VU} S is a distribution in the dual space D'(V) defined by
$$\langle \rho_{VU} S, \varphi \rangle = \langle S, E_{VU} \varphi \rangle$$
for all test functions φ ∈ D(V).
Unless U = V, the restriction to V is neither injective nor surjective. Lack of surjectivity follows since distributions
can blow up towards the boundary of V. For instance, if U = R and V = (0,2), then the distribution
$$S(x) = \sum_{n=1}^{\infty} n\, \delta\left(x - \frac{1}{n}\right)$$
is in D'(V) but admits no extension to D'(U).
Support of a distribution
Let S ∈ D'(U) be a distribution on an open set U. Then S is said to vanish on an open set V of U if S lies in the kernel
of the restriction map ρ_{VU}. Explicitly S vanishes on V if
$$\langle S, \varphi \rangle = 0$$
for all test functions φ ∈ C^∞(U) with support in V. Let V be a maximal open set on which the distribution S
vanishes; i.e., V is the union of every open set on which S vanishes. The support of S is the complement of V in U.
Thus
$$\operatorname{supp} S = U - \bigcup \{ V \mid \rho_{VU} S = 0 \}.$$
The distribution S has compact support if its support is a compact set. Explicitly, S has compact support if there is a
compact subset K of U such that for every test function φ whose support is completely outside of K, we have S(φ) = 0.
Compactly supported distributions define continuous linear functionals on the space C^∞(U); the topology on
C^∞(U) is defined such that a sequence of test functions φ_k converges to 0 if and only if all derivatives of φ_k
converge uniformly to 0 on every compact subset of U. Conversely, it can be shown that every continuous linear
functional on this space defines a distribution of compact support.
Tempered distributions and Fourier transform
By using a larger space of test functions, one can define the tempered distributions, a subspace of D'(R^n). These
distributions are useful if one studies the Fourier transform in generality: all tempered distributions have a Fourier
transform, but not all distributions have one.
The space of test functions employed here, the so-called Schwartz space S(R^n), is the function space of all infinitely
differentiable functions that are rapidly decreasing at infinity along with all partial derivatives. Thus φ : R^n → R is
in the Schwartz space provided that any derivative of φ, multiplied with any power of |x|, converges towards 0 for
|x| → ∞. These functions form a complete topological vector space with a suitably defined family of seminorms.
More precisely, let
$$p_{\alpha,\beta}(\varphi) = \sup_{x \in \mathbf{R}^n} |x^\alpha D^\beta \varphi(x)|$$
for α, β multi-indices of size n. Then φ is a Schwartz function if all the values
$$p_{\alpha,\beta}(\varphi) < \infty.$$
The family of seminorms p_{α,β} defines a locally convex topology on the Schwartz space. The seminorms are, in fact,
norms on the Schwartz space, since Schwartz functions are smooth. The Schwartz space is metrizable and complete.
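The seminorms can be estimated numerically in one dimension. The sketch below assumes the Gaussian exp(-x^2) as a sample Schwartz function and replaces the supremum over R by a maximum over a fine grid on a bounded interval, which is an approximation; the function names and grid parameters are illustrative.

```python
import math

def gaussian(x):
    return math.exp(-x * x)

def gaussian_prime(x):
    """First derivative of exp(-x^2)."""
    return -2.0 * x * math.exp(-x * x)

def seminorm(alpha, dphi, radius=10.0, n=200001):
    """Grid estimate of sup |x^alpha * dphi(x)| over [-radius, radius].

    `dphi` is the derivative of the function of the desired order, supplied
    by the caller; truncating to a bounded interval is an assumption that
    the supremum is attained there.
    """
    h = 2.0 * radius / (n - 1)
    return max(abs((-radius + k * h) ** alpha * dphi(-radius + k * h))
               for k in range(n))

p_1_0 = seminorm(1, gaussian)        # sup |x exp(-x^2)| = 1/sqrt(2e)
p_0_1 = seminorm(0, gaussian_prime)  # sup |2x exp(-x^2)| = sqrt(2/e)
p_6_0 = seminorm(6, gaussian)        # sup |x^6 exp(-x^2)| = 27 exp(-3)
```

Each of these values is finite, as the definition requires; the closed forms in the comments follow from elementary calculus and serve as a check on the grid estimate.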
The space of tempered distributions is defined as the (continuous) dual of the Schwartz space. In other words, a
distribution F is a tempered distribution if and only if
$$\lim_{m\to\infty} F(\varphi_m) = 0$$
is true whenever
$$\lim_{m\to\infty} p_{\alpha,\beta}(\varphi_m) = 0$$
holds for all multi-indices α, β.
The derivative of a tempered distribution is again a tempered distribution. Tempered distributions generalize the
bounded (or slow-growing) locally integrable functions; all distributions with compact support and all
square-integrable functions are tempered distributions. All locally integrable functions f with at most polynomial
growth, i.e. such that f(x) = O(|x|^r) for some r, are tempered distributions. This includes all functions in L^p(R^n) for
p ≥ 1.
The tempered distributions can also be characterized as slowly growing. This characterization is dual to the rapidly
falling behaviour, e.g. ∝ |x|^n · exp(-x^2), of the test functions.
To study the Fourier transform, it is best to consider complex-valued test functions and complex-linear distributions.
The ordinary continuous Fourier transform F yields then an automorphism of Schwartz function space, and we can
define the Fourier transform of the tempered distribution S by (FS)(ψ) = S(Fψ) for every test function ψ. FS is
thus again a tempered distribution. The Fourier transform is a continuous, linear, bijective operator from the space of
tempered distributions to itself. This operation is compatible with differentiation in the sense that
$$F \frac{dS}{dx} = i x\, FS$$
and also with convolution: if S is a tempered distribution and ψ is a slowly increasing infinitely differentiable
function on R^n (meaning that all derivatives of ψ grow at most as fast as polynomials), then Sψ is again a tempered
distribution and
$$F(S\psi) = FS * F\psi$$
is the convolution of FS and Fψ.
Convolution
Under some circumstances, it is possible to define the convolution of a function with a distribution, or even the
convolution of two distributions.
Convolution of a test function with a distribution
If f ∈ D(R^n) is a compactly supported smooth test function, then convolution with f defines an operator
$$C_f : D(\mathbf{R}^n) \to D(\mathbf{R}^n)$$
defined by C_f g = f * g, which is linear (and continuous with respect to the LF space topology on D(R^n)).
Convolution of f with a distribution S ∈ D'(R^n) can be defined by taking the transpose of C_f relative to the duality
pairing of D(R^n) with the space D'(R^n) of distributions (Treves 1967, Chapter 27). If f, g, φ ∈ D(R^n), then by
Fubini's theorem
$$\langle C_f g, \varphi \rangle = \int_{\mathbf{R}^n} \varphi(x) \int_{\mathbf{R}^n} f(x - y)\, g(y)\, dy\, dx = \langle g, C_{\tilde f} \varphi \rangle$$
where f̃(x) = f(-x). Extending by continuity, the convolution of f with a distribution S is defined by
$$\langle f * S, \varphi \rangle = \langle S, \tilde f * \varphi \rangle$$
for all test functions φ ∈ D(R^n).
An alternative way to define the convolution of a function f and a distribution S is to use the translation operator τ_x
defined on test functions by
$$\tau_x \varphi(y) = \varphi(y - x)$$
and extended by the transpose to distributions in the obvious way (Rudin 1991, §6.29). The convolution of the
compactly supported function f and the distribution S is then the function defined for each x ∈ R^n by
$$(f * S)(x) = \langle S, \tau_x \tilde f \rangle.$$
It can be shown that the convolution of a compactly supported function and a distribution is a smooth function. If the
distribution S has compact support as well, then f*S is a compactly supported function, and the Titchmarsh
convolution theorem (Hormander 1983, Theorem 4.3.3) implies that
$$\operatorname{ch}(f * S) = \operatorname{ch} f + \operatorname{ch} S$$
where ch denotes the convex hull of the support.
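The pointwise formula for the convolution of a test function with a distribution can be tried on the two simplest distributions. The sketch below assumes that the translate of the reflected function, evaluated at y, equals f(x - y), models distributions as callables, and approximates derivatives by central differences; the names are illustrative.

```python
import math

def bump(x):
    """Smooth test function supported on [-1, 1]."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def bump_prime(x):
    """Classical derivative of bump, by the chain rule."""
    if abs(x) >= 1.0:
        return 0.0
    return bump(x) * (-2.0 * x) / (1.0 - x * x) ** 2

def derivative(g, x, h=1e-5):
    """Central-difference approximation of g'(x)."""
    return (g(x + h) - g(x - h)) / (2.0 * h)

def delta(phi):
    return phi(0.0)

def delta_prime(phi):
    return -derivative(phi, 0.0)

def convolve(f, S):
    """Return the function x -> S applied to (y -> f(x - y))."""
    def result(x):
        return S(lambda y: f(x - y))
    return result

with_delta = convolve(bump, delta)(0.3)              # equals bump(0.3)
with_delta_prime = convolve(bump, delta_prime)(0.3)  # approximates bump'(0.3)
```

Convolution with the delta distribution returns the function itself, and convolution with its derivative returns the derivative of the function, in line with the compatibility of convolution and differentiation.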
Distribution of compact support
It is also possible to define the convolution of two distributions S and T on R^n, provided one of them has compact
support. Informally, in order to define S*T where T has compact support, the idea is to extend the definition of the
convolution * to a linear operation on distributions so that the associativity formula
$$S * (T * \varphi) = (S * T) * \varphi$$
continues to hold for all test functions φ. Hormander (1983, §IV.2) proves the uniqueness of such an extension.
It is also possible to provide a more explicit characterization of the convolution of distributions (Treves 1967,
Chapter 27). Suppose that it is T that has compact support. For any test function φ in D(R^n), consider the function
$$\psi(x) = \langle T, \tau_{-x} \varphi \rangle.$$
It can be readily shown that this defines a smooth function of x, which moreover has compact support. The
convolution of S and T is defined by
$$\langle S * T, \varphi \rangle = \langle S, \psi \rangle.$$
This generalizes the classical notion of convolution of functions and is compatible with differentiation in the
following sense:
$$\partial^\alpha (S * T) = (\partial^\alpha S) * T = S * (\partial^\alpha T).$$
This definition of convolution remains valid under less restrictive assumptions about $S$ and $T$; see for instance
Gel'fand & Shilov (1966-1968, v. 1, pp. 103-104) and Benedetto (1997, Definition 2.5.8).
Distributions as derivatives of continuous functions
The formal definition of distributions exhibits them as a subspace of a very large space, namely the algebraic dual of
$D(U)$ (or $S(\mathbf{R}^n)$ for tempered distributions). It is not immediately clear from the definition how exotic a distribution
might be. To answer this question, it is instructive to see distributions built up from a smaller space, namely the
space of continuous functions. Roughly, any distribution is locally a (multiple) derivative of a continuous function. A
precise version of this result, given below, holds for distributions of compact support, tempered distributions, and
general distributions. Generally speaking, no proper subset of the space of distributions contains all continuous
functions and is closed under differentiation. This says that distributions are not particularly exotic objects; they are
only as complicated as necessary.
Tempered distributions
If $f \in S'(\mathbf{R}^n)$ is a tempered distribution, then there exist a constant $C > 0$ and positive integers $M$ and $N$ such that for
all Schwartz functions $\varphi \in S(\mathbf{R}^n)$
$$\langle f, \varphi \rangle \le C \sum_{|\alpha| \le N, |\beta| \le M} \sup_{x \in \mathbf{R}^n} |x^\alpha D^\beta \varphi(x)| = C \sum_{|\alpha| \le N, |\beta| \le M} p_{\alpha,\beta}(\varphi).$$
This estimate along with some techniques from functional analysis can be used to show that there is a continuous
slowly increasing function $F$ and a multi-index $\alpha$ such that
$$f = D^\alpha F.$$
Compactly supported distributions
Let $U$ be an open set, and $K$ a compact subset of $U$. If $f$ is a distribution supported on $K$, then there is a continuous
function $F$ compactly supported in $U$ (possibly on a larger set than $K$ itself) such that
$$f = D^\alpha F$$
for some multi-index $\alpha$. This follows from the previously quoted result on tempered distributions by means of a
localization argument.
Distributions with point support
If $f$ has support at a single point $\{P\}$, then $f$ is in fact a finite linear combination of distributional derivatives of the $\delta$
function at $P$. That is, there exist an integer $m$ and complex constants $a_\alpha$ for multi-indices $|\alpha| \le m$ such that
$$f = \sum_{|\alpha| \le m} a_\alpha D^\alpha (\tau_P \delta)$$
where $\tau_P$ is the translation operator.
General distributions
A version of the above theorem holds locally in the following sense (Rudin 1991). Let $S$ be a distribution on $U$. Then
one can find for every multi-index $\alpha$ a continuous function $g_\alpha$ such that
$$S = \sum_\alpha D^\alpha g_\alpha$$
and that any compact subset $K$ of $U$ intersects the supports of only finitely many $g_\alpha$; therefore, to evaluate the value
of $S$ for a given smooth function $f$ compactly supported in $U$, we only need finitely many $g_\alpha$; hence the infinite sum
above is well-defined as a distribution. If the distribution $S$ is of finite order, then one can choose $g_\alpha$ in such a way
that only finitely many of them are nonzero.
Using holomorphic functions as test functions
The success of the theory led to investigation of the idea of hyperfunction, in which spaces of holomorphic functions
are used as test functions. A refined theory has been developed, in particular Mikio Sato's algebraic analysis, using
sheaf theory and several complex variables. This extends the range of symbolic methods that can be made into
rigorous mathematics, for example Feynman integrals.
Problem of multiplication
A possible limitation of the theory of distributions (and hyperfunctions) is that it is a purely linear theory, in the
sense that the product of two distributions cannot consistently be defined (in general), as was proved by Laurent
Schwartz in the 1950s. For example, if p.v. $1/x$ is the distribution obtained by the Cauchy principal value
$$\left( \operatorname{p.v.} \frac{1}{x} \right)[\varphi] = \lim_{\varepsilon \to 0^+} \int_{|x| \ge \varepsilon} \frac{\varphi(x)}{x} \, dx$$
for all $\varphi \in S(\mathbf{R})$, and $\delta$ is the Dirac delta distribution, then
$$(\delta \times x) \times \operatorname{p.v.} \frac{1}{x} = 0$$
but
$$\delta \times \left( x \times \operatorname{p.v.} \frac{1}{x} \right) = \delta$$
so the product of a distribution by a smooth function (which is always well defined) cannot be extended to an
associative product on the space of distributions.
Thus, nonlinear problems cannot be posed in general, and so cannot be solved, within distribution theory alone. In the
context of quantum field theory, however, solutions can be found. In more than two spacetime dimensions the
problem is related to the regularization of divergences. Here Henri Epstein and Vladimir Glaser developed the
mathematically rigorous (but extremely technical) causal perturbation theory. This does not solve the problem in
other situations. Many other interesting theories are nonlinear, for example the Navier-Stokes equations of fluid
dynamics.
In view of this, several not entirely satisfactory theories of algebras of generalized functions have been developed,
among which Colombeau's (simplified) algebra is perhaps the most widely used today.
A simple solution of the multiplication problem is dictated by the path integral formulation of quantum mechanics.
Since this is required to be equivalent to the Schrodinger theory of quantum mechanics which is invariant under
coordinate transformations, this property must be shared by path integrals. This fixes all products of distributions as
shown by Kleinert & Chervyakov (2001). The result is equivalent to what can be derived from dimensional
regularization (Kleinert & Chervyakov 2000).
References
• Benedetto, J.J. (1997), Harmonic Analysis and Applications, CRC Press.
• Gel'fand, I.M.; Shilov, G.E. (1966-1968), Generalized functions, 1-5, Academic Press.
• Hormander, L. (1983), The analysis of linear partial differential operators I, Grundl. Math. Wissenschaft., 256,
Springer, MR0717035, ISBN 3-540-12104-8.
• Kleinert, H.; Chervyakov, A. (2001), "Rules for integrals over products of distributions from coordinate
independence of path integrals", Europ. Phys. J. C 19: 743-747, doi:10.1007/s100520100600.
• Kleinert, H.; Chervyakov, A. (2000), "Coordinate Independence of Quantum-Mechanical Path Integrals" ,
Phys. Lett. A 269: 63, doi:10.1016/S0375-9601(00)00475-8.
• Rudin, W. (1991), Functional Analysis (2nd ed.), McGraw-Hill, ISBN 0-07-054236-8.
• Schwartz, L. (1954), "Sur l'impossibilite de la multiplications des distributions", C.R.Acad. Sci. Paris 239:
847-848.
• Schwartz, L. (1950-1951), Theorie des distributions, 1-2, Hermann.
• Stein, Elias; Weiss, Guido (1971), Introduction to Fourier Analysis on Euclidean Spaces, Princeton University
Press, ISBN 0-691-08078-X.
• Strichartz, R. (1994), A Guide to Distribution Theory and Fourier Transforms, CRC Press, ISBN 0849382734.
• Treves, Francois (1967), Topological Vector Spaces, Distributions and Kernels, Academic Press, pp. 126 ff.
Further reading
• M. J. Lighthill (1959). Introduction to Fourier Analysis and Generalised Functions. Cambridge University Press.
ISBN 0-521-09128-4 (requires very little knowledge of analysis; defines distributions as limits of sequences of
functions under integrals)
• H. Kleinert, Path Integrals in Quantum Mechanics, Statistics, Polymer Physics, and Financial Markets, 4th
edition, World Scientific (Singapore, 2006) (also available online). See Chapter 11 for defining
products of distributions from the physical requirement of coordinate invariance.
• Vladimirov, V.S. (2001), "Generalized function", in Hazewinkel, Michiel, Encyclopaedia of Mathematics,
Springer, ISBN 978-1556080104.
• Vladimirov, V.S. (2001), "Generalized functions, space of", in Hazewinkel, Michiel, Encyclopaedia of
Mathematics, Springer, ISBN 978-1556080104.
• Vladimirov, V.S. (2001), "Generalized function, derivative of a", in Hazewinkel, Michiel, Encyclopaedia of
Mathematics, Springer, ISBN 978-1556080104.
• Vladimirov, V.S. (2001), "Generalized functions, product of", in Hazewinkel, Michiel, Encyclopaedia of
Mathematics, Springer, ISBN 978-1556080104.
• Oberguggenberger, Michael (2001), "Generalized function algebras", in Hazewinkel, Michiel, Encyclopaedia
of Mathematics, Springer, ISBN 978-1556080104.
References
[1] http://www.physik.fu-berlin.de/~kleinert/kleiner_re303/wardepl.pdf
[2] http://www.physik.fu-berlin.de/~kleinert/305/klch2.pdf
[3] http://www.worldscibooks.com/physics/6223.html
[4] http://www.physik.fu-berlin.de/~kleinert/b5
[5] http://eom.springer.de/G/g043810.htm
[6] http://eom.springer.de/G/g043840.htm
[7] http://eom.springer.de/G/g043820.htm
[8] http://eom.springer.de/G/g043830.htm
[9] http://eom.springer.de/G/g130030.htm
Hilbert space
Hilbert spaces can be used to study the harmonics
of vibrating strings.
The mathematical concept of a Hilbert space, named after David
Hilbert, generalizes the notion of Euclidean space. It extends the
methods of vector algebra and calculus from the two-dimensional
Euclidean plane and three-dimensional space to spaces with any finite
or infinite number of dimensions. A Hilbert space is an abstract vector
space possessing the structure of an inner product that allows length
and angle to be measured. Furthermore, Hilbert spaces are required to
be complete, a property that stipulates the existence of enough limits in
the space to allow the techniques of calculus to be used.
Hilbert spaces arise naturally and frequently in mathematics, physics,
and engineering, typically as infinite-dimensional function spaces. The
earliest Hilbert spaces were studied from this point of view in the first
decade of the 20th century by David Hilbert, Erhard Schmidt, and
Frigyes Riesz. They are indispensable tools in the theories of partial differential equations, quantum mechanics,
Fourier analysis (which includes applications to signal processing and heat transfer), and ergodic theory, which forms
the mathematical underpinning of the study of thermodynamics. John von Neumann coined the term "Hilbert space"
for the abstract concept underlying many of these diverse applications. The success of Hilbert space methods ushered
in a very fruitful era for functional analysis. Apart from the classical Euclidean spaces, examples of Hilbert spaces
include spaces of square-integrable functions, spaces of sequences, Sobolev spaces consisting of generalized
functions, and Hardy spaces of holomorphic functions.
Geometric intuition plays an important role in many aspects of Hilbert space theory. Exact analogs of the
Pythagorean theorem and parallelogram law hold in a Hilbert space. At a deeper level, perpendicular projection onto
a subspace (the analog of "dropping the altitude" of a triangle) plays a significant role in optimization problems and
other aspects of the theory. An element of a Hilbert space can be uniquely specified by its coordinates with respect to
a set of coordinate axes (an orthonormal basis), in analogy with Cartesian coordinates in the plane. When that set of
axes is countably infinite, this means that the Hilbert space can also usefully be thought of in terms of infinite
sequences that are square-summable. Linear operators on a Hilbert space are likewise fairly concrete objects: in good
cases, they are simply transformations that stretch the space by different factors in mutually perpendicular directions
in a sense that is made precise by the study of their spectral theory.
Definition and illustration
Motivating example: Euclidean space
One of the most familiar examples of a Hilbert space is the Euclidean space consisting of three-dimensional vectors,
denoted by $\mathbf{R}^3$, and equipped with the dot product. The dot product takes two vectors $x$ and $y$, and produces a real
number $x \cdot y$. If $x$ and $y$ are represented in Cartesian coordinates, then the dot product is defined by
$$(x_1, x_2, x_3) \cdot (y_1, y_2, y_3) = x_1 y_1 + x_2 y_2 + x_3 y_3.$$
The dot product satisfies the properties:
1. It is symmetric in $x$ and $y$: $x \cdot y = y \cdot x$.
2. It is linear in its first argument: $(a x_1 + b x_2) \cdot y = a \, x_1 \cdot y + b \, x_2 \cdot y$ for any scalars $a$, $b$, and vectors $x_1$, $x_2$, and $y$.
3. It is positive definite: for all vectors $x$, $x \cdot x \ge 0$ with equality if and only if $x = 0$.
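The three properties can be checked on concrete vectors. The sketch below uses exact rational arithmetic so that the comparisons are not affected by rounding; the sample vectors and scalars are arbitrary choices made for illustration, and a check on a few vectors is of course not a proof.

```python
from fractions import Fraction


def dot(x, y):
    """Dot product of two vectors given as equal-length tuples."""
    return sum(xi * yi for xi, yi in zip(x, y))


def combo(a, x1, b, x2):
    """Linear combination a*x1 + b*x2, computed componentwise."""
    return tuple(a * u + b * v for u, v in zip(x1, x2))


x1 = (Fraction(1), Fraction(-2), Fraction(3))
x2 = (Fraction(0), Fraction(5), Fraction(1, 2))
y = (Fraction(4), Fraction(1), Fraction(-1))
a, b = Fraction(2), Fraction(-3)
zero = (Fraction(0), Fraction(0), Fraction(0))

symmetric = dot(x1, y) == dot(y, x1)
linear = dot(combo(a, x1, b, x2), y) == a * dot(x1, y) + b * dot(x2, y)
positive = dot(x1, x1) > 0 and dot(zero, zero) == 0
```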
An operation on pairs of vectors that, like the dot product, satisfies these three properties is known as a (real) inner
product. A vector space equipped with such an inner product is known as a (real) inner product space. Every
finite-dimensional inner product space is also a Hilbert space. The basic feature of the dot product that connects it
with Euclidean geometry is that it is related to both the length (or norm) of a vector, denoted $\|x\|$, and to the angle $\theta$
between two vectors $x$ and $y$ by means of the formula
$$x \cdot y = \|x\| \, \|y\| \cos\theta.$$
Multivariable calculus in Euclidean space relies on the ability to compute limits, and to have useful criteria for
concluding that limits exist. A mathematical series
$$\sum_{n=0}^\infty x_n$$
consisting of vectors in $\mathbf{R}^3$ is absolutely convergent provided that the sum of the lengths converges as an ordinary
series of real numbers:
$$\sum_{k=0}^\infty \|x_k\| < \infty.$$
Just as with a series of scalars, a series of vectors that converges absolutely also converges to some limit vector $L$ in
the Euclidean space, in the sense that
$$\left\| L - \sum_{k=0}^N x_k \right\| \to 0 \quad \text{as } N \to \infty.$$
This property expresses the completeness of Euclidean space: that a series which converges absolutely also
converges in the ordinary sense.
[Figure: Completeness means that if a particle moves along the broken path (in blue) travelling a finite total distance, then the particle has a well-defined net displacement (in orange).]
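A small numerical illustration of this behaviour follows. It is only a sketch under an assumed example: the terms x_k = (2^-k, (-1/3)^k, 0) are chosen here for convenience and do not appear in the text. Both coordinate series are geometric, so the limit vector is L = (2, 3/4, 0), and the total path length is bounded by 2 + 3/2.

```python
import math


def term(k):
    """k-th vector of the series (an assumed example)."""
    return (0.5 ** k, (-1.0 / 3.0) ** k, 0.0)


def norm(v):
    """Euclidean length of a vector in R^3."""
    return math.sqrt(sum(c * c for c in v))


def partial_sum(N):
    """Sum of the terms x_0, ..., x_N."""
    total = [0.0, 0.0, 0.0]
    for k in range(N + 1):
        total = [t + c for t, c in zip(total, term(k))]
    return tuple(total)


L = (2.0, 0.75, 0.0)  # limits of the two geometric series

# Distance from the partial sums to L for increasing N.
errors = [
    norm(tuple(l - s for l, s in zip(L, partial_sum(N))))
    for N in (5, 10, 20, 40)
]

# Total distance travelled along the broken path (truncated at 200 terms).
total_length = sum(norm(term(k)) for k in range(200))
```

The entries of `errors` shrink towards zero while `total_length` stays finite, which is the situation described by the particle picture: finite total distance, well-defined net displacement.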
Definition
A Hilbert space $H$ is a real or complex inner product space that is also a complete metric space with respect to the
distance function induced by the inner product.[2] To say that $H$ is a complex inner product space means that $H$ is a
complex vector space on which there is an inner product $\langle x, y \rangle$ associating a complex number to each pair of elements
$x, y$ of $H$ that satisfies the following properties:
• $\langle y, x \rangle$ is the complex conjugate of $\langle x, y \rangle$:
$$\langle y, x \rangle = \overline{\langle x, y \rangle}.$$
• $\langle x, y \rangle$ is linear in its first argument.[3] For all complex numbers $a$ and $b$,
$$\langle a x_1 + b x_2, y \rangle = a \langle x_1, y \rangle + b \langle x_2, y \rangle.$$
• The inner product is positive definite:
$$\langle x, x \rangle \ge 0$$
where the case of equality holds precisely when $x = 0$.
It follows from the first two properties that a complex inner product is antilinear in its second argument, meaning that
$$\langle x, a y_1 + b y_2 \rangle = \bar a \langle x, y_1 \rangle + \bar b \langle x, y_2 \rangle.$$
A real inner product space is defined in the same way, except that $H$ is a real vector space and the inner product takes
real values. Such an inner product will be bilinear: that is, linear in each argument.
The norm defined by the inner product $\langle \cdot, \cdot \rangle$ is the real-valued function
$$\|x\| = \sqrt{\langle x, x \rangle},$$
and the distance between two points $x, y$ in $H$ is defined in terms of the norm by
$$d(x, y) = \|x - y\| = \sqrt{\langle x - y, x - y \rangle}.$$
That this function is a distance function means (1) that it is symmetric in $x$ and $y$, (2) that the distance between $x$ and
itself is zero, and otherwise the distance between $x$ and $y$ must be positive, and (3) that the triangle inequality holds,
meaning that the length of one leg of a triangle $xyz$ cannot exceed the sum of the lengths of the other two legs:
$$d(x, z) \le d(x, y) + d(y, z).$$
This last property is ultimately a consequence of the more fundamental Cauchy-Schwarz inequality, which asserts
$$|\langle x, y \rangle| \le \|x\| \, \|y\|$$
with equality if and only if $x$ and $y$ are linearly dependent.
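These relations can be checked numerically in the finite-dimensional complex space C^3. The sketch assumes the standard inner product, the sum of x_i times the conjugate of y_i, which is linear in the first argument as in the definition above; the sample vectors and the scalar are arbitrary illustrative choices.

```python
import math


def inner(x, y):
    """<x, y> = sum of x_i * conj(y_i): linear in x, antilinear in y."""
    return sum(xi * yi.conjugate() for xi, yi in zip(x, y))


def norm(x):
    """Norm induced by the inner product."""
    return math.sqrt(inner(x, x).real)


def dist(x, y):
    """Distance induced by the norm."""
    return norm([xi - yi for xi, yi in zip(x, y)])


x = [1 + 2j, -1j, 3 + 0j]
y = [2 - 1j, 4 + 0j, 1 + 1j]
z = [0j, 1 + 1j, -2 + 0j]
a = 2 - 3j

conj_symmetric = abs(inner(y, x) - inner(x, y).conjugate()) < 1e-12
antilinear = abs(inner(x, [a * yi for yi in y]) - a.conjugate() * inner(x, y)) < 1e-12
cauchy_schwarz = abs(inner(x, y)) <= norm(x) * norm(y)
triangle = dist(x, z) <= dist(x, y) + dist(y, z)
```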
Relative to a distance function defined in this way, any inner product space is a metric space, and sometimes is
known as a pre-Hilbert space. A pre-Hilbert space is a Hilbert space if in addition it is complete. Completeness is
expressed using a form of the Cauchy criterion for sequences in $H$: a pre-Hilbert space $H$ is complete if every
Cauchy sequence converges with respect to this norm to an element in the space. Completeness can be characterized
by the following equivalent condition: if a series of vectors $\sum_{k=0}^\infty u_k$ converges absolutely in the sense that
$$\sum_{k=0}^\infty \|u_k\| < \infty,$$
then the series converges in $H$, in the sense that the partial sums converge to an element of $H$.
As complete normed spaces, Hilbert spaces are by definition also Banach spaces. As such they are topological
vector spaces, in which topological notions like the openness and closedness of subsets are well-defined. Of special
importance is the notion of a closed linear subspace of a Hilbert space which, with the inner product induced by
restriction, is also complete (being a closed set in a complete metric space) and therefore a Hilbert space in its own
right.
Second example: sequence spaces
The sequence space $\ell^2$ consists of all infinite sequences $z = (z_1, z_2, \ldots)$ of complex numbers such that the series
$$\sum_{n=1}^\infty |z_n|^2$$
converges. The inner product on $\ell^2$ is defined by
$$\langle z, w \rangle = \sum_{n=1}^\infty z_n \overline{w_n},$$
with the latter series converging as a consequence of the Cauchy-Schwarz inequality.
Completeness of the space holds provided that whenever a series of elements from $\ell^2$ converges absolutely (in norm),
then it converges to an element of $\ell^2$. The proof is basic in mathematical analysis, and permits mathematical series of
elements of the space to be manipulated with the same ease as series of complex numbers (or vectors in a
finite-dimensional Euclidean space).
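As a concrete illustration, truncated sums can approximate norms and inner products of square-summable sequences. The sketch below uses two assumed examples that are not taken from the text: z_n = 1/n and w_n = i/n^2. Their squared norms are the known values pi^2/6 and pi^4/90, and the Cauchy-Schwarz inequality bounds the inner product.

```python
import math


def z(n):
    """z_n = 1/n, a square-summable sequence (assumed example)."""
    return complex(1.0 / n, 0.0)


def w(n):
    """w_n = i/n^2, another square-summable sequence (assumed example)."""
    return complex(0.0, 1.0 / n ** 2)


def inner_truncated(u, v, N):
    """Partial sum of u_n * conj(v_n) for n = 1..N."""
    return sum(u(n) * v(n).conjugate() for n in range(1, N + 1))


N = 100_000
norm_z_sq = inner_truncated(z, z, N).real   # approaches pi^2 / 6
norm_w_sq = inner_truncated(w, w, N).real   # approaches pi^4 / 90
zw = inner_truncated(z, w, N)               # approaches -i * (sum of 1/n^3)

cauchy_schwarz_holds = abs(zw) <= math.sqrt(norm_z_sq * norm_w_sq)
```

The truncation level `N` is an arbitrary choice; the tail of the first series decays only like 1/N, so `norm_z_sq` agrees with pi^2/6 to about five digits.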
History
Prior to the development of Hilbert spaces, other generalizations of
Euclidean spaces were known to mathematicians and physicists. In
particular, the idea of an abstract linear space had gained some traction
towards the end of the 19th century: this is a space whose elements
can be added together and multiplied by scalars (such as real or
complex numbers) without necessarily identifying these elements with
"geometric" vectors, such as position and momentum vectors in
physical systems. Other objects studied by mathematicians at the turn
of the 20th century, in particular spaces of sequences (including series)
and spaces of functions,[7] can naturally be thought of as linear spaces.
Functions, for instance, can be added together or multiplied by
constant scalars, and these operations obey the algebraic laws satisfied
by addition and scalar multiplication of spatial vectors.
In the first decade of the 20th century, parallel developments led to the
introduction of Hilbert spaces. The first of these was the observation,
which arose during David Hilbert and Erhard Schmidt's study of
integral equations,[8] that two square-integrable real-valued functions $f$
and $g$ on an interval $[a, b]$ have an inner product
$$\langle f, g \rangle = \int_a^b f(x) g(x) \, dx$$
which has many of the familiar properties of the Euclidean dot product. In particular, the idea of an orthogonal
family of functions has meaning. Schmidt exploited the similarity of this inner product with the usual dot product to
prove an analog of the spectral decomposition for an operator of the form
$$f(x) \mapsto \int_a^b K(x, y) f(y) \, dy$$
where $K$ is a continuous function symmetric in $x$ and $y$. The resulting eigenfunction expansion expresses the function
$K$ as a series of the form
$$K(x, y) = \sum_n \lambda_n \varphi_n(x) \varphi_n(y)$$
where the functions $\varphi_n$ are orthogonal in the sense that $\langle \varphi_n, \varphi_m \rangle = 0$ for all $n \ne m$. The individual terms in this
series are sometimes referred to as elementary product solutions. However, there are eigenfunction expansions which
fail to converge in a suitable sense to a square-integrable function: the missing ingredient, which ensures
convergence, is completeness.
The second development was the Lebesgue integral, an alternative to the Riemann integral introduced by Henri
Lebesgue in 1904. The Lebesgue integral made it possible to integrate a much broader class of functions. In 1907,
Frigyes Riesz and Ernst Sigismund Fischer independently proved that the space $L^2$ of square Lebesgue-integrable
functions is a complete metric space. As a consequence of the interplay between geometry and completeness, the
19th century results of Joseph Fourier, Friedrich Bessel and Marc-Antoine Parseval on trigonometric series easily
carried over to these more general spaces, resulting in a geometrical and analytical apparatus now usually known as
the Riesz-Fischer theorem.[12]
Further basic results were proved in the early 20th century. For example, the Riesz representation theorem was
independently established by Maurice Frechet and Frigyes Riesz in 1907.[13] John von Neumann coined the term
abstract Hilbert space in his work on unbounded Hermitian operators.[14] Although other mathematicians such as
Hermann Weyl and Norbert Wiener had already studied particular Hilbert spaces in great detail, often from a
physically-motivated point of view, von Neumann gave the first complete and axiomatic treatment of them.[15] Von
Neumann later used them in his seminal work on the foundations of quantum mechanics,[16] and in his continued
work with Eugene Wigner. The name "Hilbert space" was soon adopted by others, for example by Hermann Weyl in
his book on quantum mechanics and the theory of groups.[17]
The significance of the concept of a Hilbert space was underlined with the realization that it offers one of the best
mathematical formulations of quantum mechanics.[18] In short, the states of a quantum mechanical system are
vectors in a certain Hilbert space, the observables are Hermitian operators on that space, the symmetries of the
system are unitary operators, and measurements are orthogonal projections. The relation between quantum
mechanical symmetries and unitary operators provided an impetus for the development of the unitary representation
theory of groups, initiated in the 1928 work of Hermann Weyl.[17] On the other hand, in the early 1930s it became
clear that certain properties of classical dynamical systems can be analyzed using Hilbert space techniques in the
framework of ergodic theory.
The algebra of observables in quantum mechanics is naturally an algebra of operators defined on a Hilbert space,
according to Werner Heisenberg's matrix mechanics formulation of quantum theory. Von Neumann began
investigating operator algebras in the 1930s, as rings of operators on a Hilbert space. The kind of algebras studied by
von Neumann and his contemporaries are now known as von Neumann algebras. In the 1940s, Israel Gelfand, Mark
Naimark and Irving Segal gave a definition of a kind of operator algebras called C*-algebras that on the one hand
made no reference to an underlying Hilbert space, and on the other extrapolated many of the useful features of the
operator algebras that had previously been studied. The spectral theorem for self-adjoint operators in particular that
underlies much of the existing Hilbert space theory was generalized to C*-algebras. These techniques are now basic
in abstract harmonic analysis and representation theory.
Examples
Lebesgue spaces
Lebesgue spaces are function spaces associated to measure spaces $(X, M, \mu)$, where $X$ is a set, $M$ is a $\sigma$-algebra of
subsets of $X$, and $\mu$ is a countably additive measure on $M$. Let $L^2(X, \mu)$ be the space of those complex-valued
measurable functions on $X$ for which the Lebesgue integral of the square of the absolute value of the function is
finite, i.e., for a function $f$ in $L^2(X, \mu)$,
$$\int_X |f|^2 \, d\mu < \infty,$$
and where functions are identified if and only if they differ only on a set of measure zero.
The inner product of functions $f$ and $g$ in $L^2(X, \mu)$ is then defined as
$$\langle f, g \rangle = \int_X f(t) \overline{g(t)} \, d\mu(t).$$
For $f$ and $g$ in $L^2$, this integral exists because of the Cauchy-Schwarz inequality, and defines an inner product on the
space. Equipped with this inner product, $L^2$ is in fact complete.[20] The Lebesgue integral is essential to ensure
completeness: on domains of real numbers, for instance, not enough functions are Riemann integrable.[21]
The Lebesgue spaces appear in many natural settings. The spaces $L^2(\mathbf{R})$ and $L^2([0,1])$ of square-integrable functions
with respect to the Lebesgue measure on the real line and unit interval, respectively, are natural domains on which to
define the Fourier transform and Fourier series. In other situations, the measure may be something other than the
ordinary Lebesgue measure on the real line. For instance, if $w$ is any positive measurable function, the space of all
measurable functions $f$ on the interval $[0,1]$ satisfying
$$\int_0^1 |f(t)|^2 w(t) \, dt < \infty$$
is called the weighted $L^2$ space $L^2_w([0,1])$, and $w$ is called the weight function. The inner product is defined by
$$\langle f, g \rangle = \int_0^1 f(t) \overline{g(t)} w(t) \, dt.$$
The weighted space $L^2_w([0,1])$ is identical with the Hilbert space $L^2([0,1], \mu)$ where the measure $\mu$ of a
Lebesgue-measurable set $A$ is defined by
$$\mu(A) = \int_A w(t) \, dt.$$
Weighted $L^2$ spaces like this are frequently used to study orthogonal polynomials, because different families of
orthogonal polynomials are orthogonal with respect to different weighting functions.
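The dependence of orthogonality on the weight can be seen by running the Gram-Schmidt process on the monomials 1, t, t^2 in two different weighted spaces on [0,1]. The sketch below is an illustration only: it restricts to real polynomials and polynomial weights so that every integral can be computed exactly, and the helper names `poly_mul`, `inner` and `gram_schmidt` are hypothetical. Polynomials are stored as coefficient lists, lowest degree first.

```python
from fractions import Fraction


def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def inner(p, q, weight):
    """Exact value of the integral over [0,1] of p(t) q(t) w(t) dt."""
    prod = poly_mul(poly_mul(p, q), weight)
    # The integral of t^k over [0,1] equals 1/(k+1).
    return sum(c / (k + 1) for k, c in enumerate(prod))


def gram_schmidt(degree, weight):
    """Monic polynomials of degree 0..degree, orthogonal for the given weight."""
    basis = []
    for n in range(degree + 1):
        p = [Fraction(0)] * n + [Fraction(1)]  # the monomial t^n
        for q in basis:
            coeff = inner(p, q, weight) / inner(q, q, weight)
            q_padded = q + [Fraction(0)] * (len(p) - len(q))
            p = [a - coeff * b for a, b in zip(p, q_padded)]
        basis.append(p)
    return basis


w_flat = [Fraction(1)]                 # w(t) = 1
w_tilt = [Fraction(0), Fraction(1)]    # w(t) = t

flat = gram_schmidt(2, w_flat)
tilted = gram_schmidt(2, w_tilt)

# Polynomials orthogonal for w = 1 are generally not orthogonal for w = t.
cross_check = inner(flat[1], flat[2], w_tilt)
```

For the constant weight the process produces t - 1/2 and t^2 - t + 1/6, which are monic shifted Legendre polynomials; for w(t) = t the degree-one polynomial becomes t - 2/3 instead.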
Sobolev spaces
Sobolev spaces, denoted by $H^s$ or $W^{s,2}$, are Hilbert spaces. These are a special kind of function space in which
differentiation may be performed, but which (unlike other Banach spaces such as the Holder spaces) support the
structure of an inner product. Because differentiation is permitted, Sobolev spaces are a convenient setting for the
theory of partial differential equations.[22] They also form the basis of the theory of direct methods in the calculus of
variations.[23]
For $s$ a non-negative integer and $\Omega \subset \mathbf{R}^n$, the Sobolev space $H^s(\Omega)$ contains $L^2$ functions whose weak derivatives of
order up to $s$ are also $L^2$. The inner product in $H^s(\Omega)$ is
$$\langle f, g \rangle = \int_\Omega f(x) \overline{g(x)} \, dx + \int_\Omega Df(x) \cdot D\overline{g(x)} \, dx + \cdots + \int_\Omega D^s f(x) \cdot D^s \overline{g(x)} \, dx$$
where the dot indicates the dot product in the Euclidean space of partial derivatives of each order. Sobolev spaces
can also be defined when $s$ is not an integer.
Sobolev spaces are also studied from the point of view of spectral theory, relying more specifically on the Hilbert
space structure. If $\Omega$ is a suitable domain, then one can define the Sobolev space $H^s(\Omega)$ as the space of Bessel
potentials; roughly,
$$H^s(\Omega) = \{ (1 - \Delta)^{-s/2} f \mid f \in L^2(\Omega) \}.$$
Here $\Delta$ is the Laplacian and $(1 - \Delta)^{-s/2}$ is understood in terms of the spectral mapping theorem. Apart from
providing a workable definition of Sobolev spaces for non-integer $s$, this definition also has particularly desirable
properties under the Fourier transform that make it ideal for the study of pseudodifferential operators. Using these
methods on a compact Riemannian manifold, one can obtain for instance the Hodge decomposition which is the
basis of Hodge theory.
Spaces of holomorphic functions
Hardy spaces
The Hardy spaces are function spaces, arising in complex analysis and harmonic analysis, whose elements are
certain holomorphic functions in a complex domain.[26] Let $U$ denote the unit disc in the complex plane. Then the
Hardy space $H^2(U)$ is defined to be the space of holomorphic functions $f$ on $U$ such that the means
$$M_r(f) = \frac{1}{2\pi} \int_0^{2\pi} |f(r e^{i\theta})|^2 \, d\theta$$
remain bounded for $r < 1$. The norm on this Hardy space is defined by
$$\|f\|_2 = \lim_{r \to 1} \sqrt{M_r(f)}.$$
Hardy spaces in the disc are related to Fourier series. A function $f$ is in $H^2(U)$ if and only if
$$f(z) = \sum_{n=0}^\infty a_n z^n$$
where
$$\sum_{n=0}^\infty |a_n|^2 < \infty.$$
Thus $H^2(U)$ consists of those functions which are $L^2$ on the circle, and whose negative frequency Fourier coefficients
vanish.
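The link between the circle means and the power series coefficients can be checked numerically. The sketch below assumes the example f(z) = 1/(1 - z/2), chosen here because its coefficients a_n = 2^-n are known in closed form; it is not taken from the text. By Parseval's identity the mean M_r(f) should equal the sum of |a_n|^2 r^(2n), and the Hardy norm should be the square root of 4/3.

```python
import cmath
import math


def f(z):
    """Assumed example: f(z) = 1 / (1 - z/2), with a_n = 2**-n."""
    return 1.0 / (1.0 - z / 2.0)


def mean_square(r, samples=4096):
    """Approximate M_r(f) by averaging |f|^2 over equally spaced angles."""
    total = 0.0
    for k in range(samples):
        theta = 2.0 * math.pi * k / samples
        total += abs(f(r * cmath.exp(1j * theta))) ** 2
    return total / samples


def coefficient_sum(r, terms=200):
    """Truncated sum of |a_n|^2 r^(2n) with a_n = 2**-n."""
    return sum((0.5 ** n) ** 2 * r ** (2 * n) for n in range(terms))


radii = (0.5, 0.9, 0.999)
means = [mean_square(r) for r in radii]
series = [coefficient_sum(r) for r in radii]

# Norm computed from the coefficients: sqrt(sum of 4**-n) = sqrt(4/3).
hardy_norm = math.sqrt(sum(0.25 ** n for n in range(200)))
```

The means increase with r and stay below 4/3, consistent with the boundedness condition in the definition.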
Bergman spaces
The Bergman spaces are another family of Hilbert spaces of holomorphic functions.[27] Let $D$ be a bounded open set
in the complex plane (or a higher dimensional complex space) and let $L^{2,h}(D)$ be the space of holomorphic functions
$f$ in $D$ that are also in $L^2(D)$ in the sense that
$$\|f\|^2 = \int_D |f(z)|^2 \, d\mu(z) < \infty,$$
where the integral is taken with respect to the Lebesgue measure in $D$. Clearly $L^{2,h}(D)$ is a subspace of $L^2(D)$; in fact,
it is a closed subspace, and so a Hilbert space in its own right. This is a consequence of the estimate, valid on
compact subsets $K$ of $D$, that
$$\sup_{z \in K} |f(z)| \le C_K \|f\|_2,$$
which in turn follows from Cauchy's integral formula. Thus convergence of a sequence of holomorphic functions in
$L^2(D)$ implies also compact convergence, and so the limit function is also holomorphic. Another consequence of this
inequality is that the linear functional that evaluates a function $f$ at a point of $D$ is actually continuous on $L^{2,h}(D)$.
The Riesz representation theorem implies that the evaluation functional can be represented as an element of $L^{2,h}(D)$.
Thus, for every $z \in D$, there is a function $\eta_z \in L^{2,h}(D)$ such that
$$f(z) = \int_D f(\zeta) \overline{\eta_z(\zeta)} \, d\mu(\zeta)$$
for all $f \in L^{2,h}(D)$. The integrand
$$K(\zeta, z) = \overline{\eta_z(\zeta)}$$
is known as the Bergman kernel of $D$. This integral kernel satisfies a reproducing property
$$f(z) = \int_D f(\zeta) K(\zeta, z) \, d\mu(\zeta).$$
A Bergman space is an example of a reproducing kernel Hilbert space, which is a Hilbert space of functions along
with a kernel $K(\zeta, z)$ that verifies a reproducing property analogous to this one. The Hardy space $H^2(D)$ also admits a
reproducing kernel, known as the Szego kernel. Reproducing kernels are common in other areas of mathematics
as well. For instance, in harmonic analysis the Poisson kernel is a reproducing kernel for the Hilbert space of
square-integrable harmonic functions in the unit ball. That the latter is a Hilbert space at all is a consequence of the
mean value theorem for harmonic functions.
Applications
Many of the applications of Hilbert spaces exploit the fact that Hilbert spaces support generalizations of simple
geometric concepts like projection and change of basis from their usual finite dimensional setting. In particular, the
spectral theory of continuous self-adjoint linear operators on a Hilbert space generalizes the usual spectral
decomposition of a matrix, and this often plays a major role in applications of the theory to other areas of
mathematics and physics.
Sturm-Liouville theory
In the theory of ordinary differential equations, spectral methods on a suitable Hilbert space are used to study the
behavior of eigenvalues and eigenfunctions of differential equations. For example, the Sturm-Liouville problem arises
in the study of the harmonics of waves in a violin string or a drum, and is a central problem in ordinary
differential equations. The problem is a differential equation of the form
$$-\frac{d}{dx} \left[ p(x) \frac{dy}{dx} \right] + q(x) y = \lambda w(x) y$$
for an unknown function $y$ on an interval $[a, b]$, satisfying general homogeneous Robin boundary conditions
$$\begin{cases} \alpha y(a) + \alpha' y'(a) = 0 \\ \beta y(b) + \beta' y'(b) = 0. \end{cases}$$
The functions $p$, $q$, and $w$ are given in advance, and the problem is to find the function $y$ and constants $\lambda$ for which
the equation has a solution. The problem only has solutions for certain values of $\lambda$, called eigenvalues of the system,
and this is a consequence of the spectral theorem for compact operators applied to the integral operator defined by
the Green's function for the system. Furthermore, another consequence of this general result is that the eigenvalues $\lambda$
of the system can be arranged in an increasing sequence tending to infinity.
[Figure: The overtones of a vibrating string. These are eigenfunctions of an associated Sturm-Liouville problem. The eigenvalues 1, 1/2, 1/3, ... form the (musical) harmonic series.]
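The discreteness of the eigenvalues can be observed numerically in the simplest case. The sketch below assumes p = w = 1, q = 0 on the interval [0, pi] with Dirichlet conditions y(0) = y(pi) = 0, a special case chosen here for illustration, so the problem reads -y'' = lambda y and the exact eigenvalues are 1, 4, 9, and so on. It uses a shooting method, which is one of several possible techniques and is not the operator-theoretic argument described above: integrate from the left end with y(0) = 0, y'(0) = 1 and look for the values of lambda at which y(pi) = 0.

```python
import math


def shoot(lam, steps=2000):
    """Integrate -y'' = lam * y on [0, pi] with y(0)=0, y'(0)=1 (RK4); return y(pi)."""
    h = math.pi / steps
    y, v = 0.0, 1.0  # v stands for y'
    for _ in range(steps):
        k1y, k1v = v, -lam * y
        k2y, k2v = v + 0.5 * h * k1v, -lam * (y + 0.5 * h * k1y)
        k3y, k3v = v + 0.5 * h * k2v, -lam * (y + 0.5 * h * k2y)
        k4y, k4v = v + h * k3v, -lam * (y + h * k3y)
        y += h * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
        v += h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
    return y


def eigenvalue(lo, hi, tol=1e-10):
    """Bisection on lam in [lo, hi]; assumes y(pi) changes sign on the bracket."""
    f_lo = shoot(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = shoot(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Brackets around the expected eigenvalues n^2 for n = 1, 2, 3.
eigenvalues = [eigenvalue(n * n - 0.5, n * n + 0.5) for n in (1, 2, 3)]
```

Between the brackets no other value of lambda satisfies the right-hand boundary condition, which reflects the statement that solutions exist only for a discrete, increasing sequence of eigenvalues.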
Partial differential equations
[221
Hilbert spaces form a basic tool in the study of partial differential equations. For many classes of partial
differential equations, such as linear elliptic equations, it is possible to consider a generalized solution (known as a
weak solution) by enlarging the class of functions. Many weak formulations involve the class of Sobolev functions,
which is a Hilbert space. A suitable weak formulation reduces to a geometrical problem the analytic problem of
finding a solution or, often what is more important, showing that a solution exists and is unique for given boundary
data. For linear elliptic equations, one geometrical result that ensures unique solvability for a large class of problems
is the Lax-Milgram theorem. This strategy forms the rudiment of the Galerkin method (a finite element method) for numerical solution of partial differential equations.[31]
A typical example is the Poisson equation $-\Delta u = g$ with Dirichlet boundary conditions in a bounded domain $\Omega$ in $\mathbb{R}^2$. The weak formulation consists of finding a function u such that, for all continuously differentiable functions v in $\Omega$ vanishing on the boundary:
$$\int_\Omega \nabla u \cdot \nabla v = \int_\Omega g\,v.$$
This can be recast in terms of the Hilbert space $H_0^1(\Omega)$ consisting of functions u such that u, along with its weak partial derivatives, are square integrable on $\Omega$, and which vanish on the boundary. The question then reduces to finding u in this space such that for all v in this space
$$a(u, v) = b(v)$$
where a is a continuous bilinear form, and b is a continuous linear functional, given respectively by
$$a(u, v) = \int_\Omega \nabla u \cdot \nabla v, \qquad b(v) = \int_\Omega g\,v.$$
Since the Poisson equation is elliptic, it follows from Poincaré's inequality that the bilinear form a is coercive. The Lax-Milgram theorem then ensures the existence and uniqueness of solutions of this equation.
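The Galerkin idea can be sketched in a one-dimensional analogue of the example above: -u'' = g on (0, 1) with u(0) = u(1) = 0, discretized with piecewise-linear "hat" functions. The function name `solve_poisson_1d`, the uniform mesh and the load approximation h·g(x_i) are assumptions made for this illustration, not part of the text.

```python
def solve_poisson_1d(g, n):
    """Galerkin (piecewise-linear finite element) solution of -u'' = g on (0, 1)
    with u(0) = u(1) = 0, using n interior nodes on a uniform mesh.

    Illustrative helper (hypothetical name). The load integrals b(phi_i) are
    approximated by h * g(x_i), which is exact when g is constant.
    """
    h = 1.0 / (n + 1)
    nodes = [(i + 1) * h for i in range(n)]
    # Stiffness matrix a(phi_j, phi_i) is tridiagonal: 2/h on the diagonal, -1/h beside it.
    diag = [2.0 / h] * n
    off = -1.0 / h
    rhs = [h * g(x) for x in nodes]
    # Thomas algorithm: forward elimination, then back substitution.
    for i in range(1, n):
        m = off / diag[i - 1]
        diag[i] -= m * off
        rhs[i] -= m * rhs[i - 1]
    u = [0.0] * n
    u[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        u[i] = (rhs[i] - off * u[i + 1]) / diag[i]
    return nodes, u


# With g = 1 the exact solution is u(x) = x(1 - x)/2.
nodes, u = solve_poisson_1d(lambda x: 1.0, 49)
exact = [x * (1.0 - x) / 2.0 for x in nodes]
max_error = max(abs(a - b) for a, b in zip(u, exact))
```

The finite-dimensional system is uniquely solvable for the same reason the continuous problem is: the stiffness matrix inherits coercivity from the bilinear form a.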
Hilbert spaces allow for many elliptic partial differential equations to be formulated in a similar way, and the
Lax-Milgram theorem is then a basic tool in their analysis. With suitable modifications, similar techniques can be
applied to parabolic partial differential equations and certain hyperbolic partial differential equations.
Ergodic theory
The field of ergodic theory is the study of the long-term behavior of
chaotic dynamical systems. The prototypical case of a field to which
ergodic theory is applicable is that of thermodynamics in which,
although the microscopic state of a system is extremely
complicated — it is impossible to understand the ensemble of individual
collisions between particles of matter — the average behavior over
sufficiently long time intervals is tractable. The laws of
thermodynamics are assertions about such average behavior. In
particular, one formulation of the zeroth law of thermodynamics
asserts that over sufficiently long timescales, the only functionally
independent measurement that one can make of a thermodynamic system in equilibrium is its total energy, in the
form of temperature.
[Figure: The path of a billiard ball in the Bunimovich stadium is described by an ergodic dynamical system.]
An ergodic dynamical system is one for which, apart from the energy (measured by the Hamiltonian), there are no other functionally independent conserved quantities on the phase space. More explicitly, suppose that the energy E is fixed, and let $\Omega_E$ be the subset of the phase space consisting of all states of energy E (an energy surface), and let $T_t$ denote the evolution operator on the phase space. The dynamical system is ergodic if there are no continuous non-constant functions f on $\Omega_E$ such that
$$f(T_t w) = f(w)$$
for all w on $\Omega_E$ and all time t. Liouville's theorem implies that there exists a measure $\mu$ on the energy surface that is invariant under the time translation. As a result, time translation is a unitary transformation of the Hilbert space $L^2(\Omega_E, \mu)$ consisting of square-integrable functions on the energy surface $\Omega_E$ with respect to the inner product
$$\langle f, g\rangle_{L^2(\Omega_E, \mu)} = \int_{\Omega_E} f\,\bar{g}\,d\mu.$$
The von Neumann mean ergodic theorem states the following:
• If $U_t$ is a (strongly continuous) one-parameter semigroup of unitary operators on a Hilbert space H, and P is the orthogonal projection onto the space of common fixed points of $U_t$, $\{x \in H \mid U_t x = x \text{ for all } t > 0\}$, then
$$Px = \lim_{T \to \infty} \frac{1}{T} \int_0^T U_t x\,dt.$$
For an ergodic system, the fixed set of the time evolution consists only of the constant functions, so the ergodic theorem implies the following:[32] for any function $f \in L^2(\Omega_E, \mu)$,
$$L^2\text{-}\lim_{T \to \infty} \frac{1}{T} \int_0^T f(T_t w)\,dt = \int_{\Omega_E} f(y)\,d\mu(y).$$
That is, the long time average of an observable f is equal to its expectation value over an energy surface.
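The equality of time and space averages can be illustrated numerically in a discrete-time analogue, which is an assumption of this sketch rather than the continuous-time setting of the theorem: rotation of the circle [0, 1) by an irrational angle, a standard ergodic system for Lebesgue measure. The names `time_average`, `alpha` and `observable` are hypothetical.

```python
import math


def time_average(f, x0, alpha, steps):
    """Average f along the orbit x0, x0 + alpha, x0 + 2*alpha, ... (mod 1)."""
    total = 0.0
    x = x0
    for _ in range(steps):
        total += f(x)
        x = (x + alpha) % 1.0
    return total / steps


alpha = (math.sqrt(5.0) - 1.0) / 2.0            # irrational rotation angle (assumed example)
observable = lambda x: math.cos(2.0 * math.pi * x) ** 2
orbit_average = time_average(observable, 0.1, alpha, 100_000)
space_average = 0.5                              # integral of cos^2(2*pi*x) over [0, 1]
```

Here `orbit_average` approaches `space_average` as the number of steps grows, whatever starting point is used.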
Fourier analysis
One of the basic goals of Fourier analysis is to decompose a function
into a (possibly infinite) linear combination of given basis functions:
the associated Fourier series. The classical Fourier series associated to
a function f defined on the interval [0,1] is a series of the form
$$\sum_{n=-\infty}^{\infty} a_n e^{2\pi i n\theta}$$
where
$$a_n = \int_0^1 f(\theta)\,e^{-2\pi i n\theta}\,d\theta.$$
[Figure: Superposition of sinusoidal wave basis functions (bottom) to form a sawtooth wave (top).]
[Figure: Spherical harmonics, an orthonormal basis for the Hilbert space of square-integrable functions on the sphere, shown graphed along the radial direction.]
The example of adding up the first few terms in a Fourier series for a sawtooth function is shown in the figure. The basis functions are sine waves with wavelengths $\lambda/n$ (n an integer) shorter than the wavelength $\lambda$ of the sawtooth itself (except for n = 1, the fundamental wave). All basis functions have nodes at the nodes of the sawtooth, but all but the fundamental have additional nodes. The oscillation of the summed terms about the sawtooth is called the Gibbs phenomenon.
A significant problem in classical Fourier series asks in what sense the Fourier series converges, if at all, to the function f. Hilbert space methods provide one possible answer to this question. The functions $e_n(\theta) = e^{2\pi i n\theta}$ form an orthogonal basis of the Hilbert space $L^2([0,1])$. Consequently, any square-integrable function can be expressed as a series
$$f(\theta) = \sum_n a_n e_n(\theta), \qquad a_n = \langle f, e_n\rangle$$
and, moreover, this series converges in the Hilbert space sense (that is, in the $L^2$ mean).
The problem can also be studied from the abstract point of view: every Hilbert space has an orthonormal basis, and every element of the Hilbert space can be written in a unique way as a sum of multiples of these basis elements. The coefficients appearing on these basis elements are sometimes known abstractly as the Fourier coefficients of the element of the space.[34] The abstraction is especially useful when it is more natural to use different basis functions for a space such as $L^2([0,1])$. In many circumstances, it is desirable not to decompose a function into trigonometric functions, but rather into orthogonal polynomials or wavelets for instance,[35] and in higher dimensions into spherical harmonics.[36]
For instance, if $e_n$ are any orthonormal basis functions of $L^2[0,1]$, then a given function in $L^2[0,1]$ can be approximated as a finite linear combination[37]
$$f(x) \approx f_n(x) = a_1 e_1(x) + a_2 e_2(x) + \cdots + a_n e_n(x).$$
The coefficients $\{a_j\}$ are selected to make the magnitude of the difference $\|f - f_n\|^2$ as small as possible. Geometrically, the best approximation is the orthogonal projection of f onto the subspace consisting of all linear combinations of the $\{e_j\}$, and can be calculated by[38]
$$a_j = \int_0^1 \overline{e_j(x)}\,f(x)\,dx.$$
That this formula minimizes the difference $\|f - f_n\|^2$ is a consequence of Bessel's inequality and Parseval's formula.
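A small numerical sketch of these coefficient formulas, under stated assumptions: the sample function f(x) = x, the truncation order N = 10 and the midpoint-rule quadrature (with the hypothetical helper `fourier_coefficient`) are all chosen for illustration. It computes the coefficients against the exponential basis and compares the partial sum of squared moduli with the squared norm of f, as Bessel's inequality predicts.

```python
import cmath
import math


def fourier_coefficient(f, n, samples=2000):
    """Approximate a_n = integral_0^1 f(x) * exp(-2*pi*i*n*x) dx by the midpoint rule."""
    h = 1.0 / samples
    total = 0j
    for k in range(samples):
        x = (k + 0.5) * h
        total += f(x) * cmath.exp(-2j * math.pi * n * x)
    return total * h


f = lambda x: x                       # sample function (assumed for illustration)
N = 10
coeffs = {n: fourier_coefficient(f, n) for n in range(-N, N + 1)}

# Bessel: the sum of |a_n|^2 over finitely many n cannot exceed ||f||^2.
bessel_sum = sum(abs(a) ** 2 for a in coeffs.values())
norm_squared = 1.0 / 3.0              # integral of x^2 over [0, 1]
```

For this f the exact coefficients are a_0 = 1/2 and a_n = i/(2πn) for n ≠ 0, so `bessel_sum` increases towards 1/3 as N grows, which is Parseval's formula in the limit.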
In various applications to physical problems, a function can be decomposed into physically meaningful
eigenfunctions of a differential operator (typically the Laplace operator): this forms the foundation for the spectral
study of functions, in reference to the spectrum of the differential operator.[39] A concrete physical application
involves the problem of hearing the shape of a drum: given the fundamental modes of vibration that a drumhead is
capable of producing, can one infer the shape of the drum itself? The mathematical formulation of this question
involves the Dirichlet eigenvalues of the Laplace equation in the plane, that represent the fundamental modes of
vibration in direct analogy with the integers that represent the fundamental modes of vibration of the violin string.
Spectral theory also underlies certain aspects of the Fourier transform of a function. Whereas Fourier analysis
decomposes a function defined on a compact set into the discrete spectrum of the Laplacian (which corresponds to
the vibrations of a violin string or drum), the Fourier transform of a function is the decomposition of a function
defined on all of Euclidean space into its components in the continuous spectrum of the Laplacian. The Fourier
transformation is also geometrical, in a sense made precise by the Plancherel theorem, that asserts that it is an
isometry of one Hilbert space (the "time domain") with another (the "frequency domain"). This isometry property of
the Fourier transformation is a recurring theme in abstract harmonic analysis, as evidenced for instance by the
Plancherel theorem for spherical functions occurring in noncommutative harmonic analysis.
Quantum mechanics
In the mathematically rigorous formulation of quantum mechanics,
developed by Paul Dirac and John von Neumann, the possible
states (more precisely, the pure states) of a quantum mechanical system
are represented by unit vectors (called state vectors) residing in a
complex separable Hilbert space, known as the state space, well
defined up to a complex number of norm 1 (the phase factor). In other
words, the possible states are points in the projectivization of a Hilbert
space, usually called the complex projective space. The exact nature of
this Hilbert space is dependent on the system; for example, the state space for the position
and momentum of a single non-relativistic spin-zero particle is
the space of all square-integrable functions, while the states for the
spin of a single proton are unit elements of the two-dimensional
complex Hilbert space of spinors. Each observable is represented by a
self-adjoint linear operator acting on the state space. Each eigenstate of
an observable corresponds to an eigenvector of the operator, and the associated eigenvalue corresponds to the value
of the observable in that eigenstate.
[Figure: The orbitals of an electron in a hydrogen atom are eigenfunctions of the energy.]
The time evolution of a quantum state is described by the Schrodinger equation, in which the Hamiltonian, the
operator corresponding to the total energy of the system, generates time evolution.
The inner product between two state vectors is a complex number known as a probability amplitude. During an ideal
measurement of a quantum mechanical system, the probability that a system collapses from a given initial state to a
particular eigenstate is given by the square of the absolute value of the probability amplitudes between the initial and
final states. The possible results of a measurement are the eigenvalues of the operator — which explains the choice of
self-adjoint operators, for all the eigenvalues must be real. The probability distribution of an observable in a given
state can be found by computing the spectral decomposition of the corresponding operator.
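The measurement rule can be sketched for the simplest case, a two-dimensional state space. The choice of observable (the Pauli-X matrix, whose eigenvectors are written out by hand) and of the state vector are assumptions for illustration; `inner` is a hypothetical helper.

```python
import math


def inner(u, v):
    """Inner product <u, v>, linear in u and conjugate-linear in v."""
    return sum(a * b.conjugate() for a, b in zip(u, v))


# Eigenvectors of the Pauli-X observable, keyed by their eigenvalues +1 and -1.
s = 1.0 / math.sqrt(2.0)
eigenstates = {+1: [complex(s), complex(s)], -1: [complex(s), complex(-s)]}

state = [complex(1.0), complex(0.0)]            # unit state vector (assumed example)

# Probability of each outcome = squared modulus of the probability amplitude.
probabilities = {value: abs(inner(vec, state)) ** 2 for value, vec in eigenstates.items()}
expectation = sum(value * p for value, p in probabilities.items())
```

The possible outcomes are the (real) eigenvalues, and the probabilities sum to one because the eigenvectors form an orthonormal basis.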
For a general system, states are typically not pure, but instead are represented as statistical mixtures of pure states, or
mixed states, given by density matrices: self-adjoint operators of trace one on a Hilbert space. Moreover, for general
quantum mechanical systems, the effects of a single measurement can influence other parts of a system in a manner
that is described instead by a positive operator valued measure. Thus the structure both of the states and observables
in the general theory is considerably more complicated than the idealization for pure states.
Heisenberg's uncertainty principle is represented by the statement that the operators corresponding to certain
observables do not commute, and gives a specific form that the commutator must have.
Properties
Pythagorean identity
Two vectors u and v in a Hilbert space H are orthogonal when $\langle u, v\rangle = 0$. The notation for this is $u \perp v$. More generally, when S is a subset in H, the notation $u \perp S$ means that u is orthogonal to every element from S.
When u and v are orthogonal, one has
$$\|u + v\|^2 = \langle u + v, u + v\rangle = \langle u, u\rangle + 2\,\mathrm{Re}\,\langle u, v\rangle + \langle v, v\rangle = \|u\|^2 + \|v\|^2.$$
By induction on n, this is extended to any family $u_1, \dots, u_n$ of n orthogonal vectors,
$$\|u_1 + \cdots + u_n\|^2 = \|u_1\|^2 + \cdots + \|u_n\|^2.$$
Whereas the Pythagorean identity as stated is valid in any inner product space, completeness is required for the extension of the Pythagorean identity to series. A series $\sum u_k$ of orthogonal vectors converges in H if and only if the series of squares of norms converges, and
$$\left\|\sum_{k=0}^{\infty} u_k\right\|^2 = \sum_{k=0}^{\infty} \|u_k\|^2.$$
Furthermore, the sum of a series of orthogonal vectors is independent of the order in which it is taken.
Parallelogram identity and polarization
By definition, every Hilbert space is also a Banach space. Furthermore, in every Hilbert space the following parallelogram identity holds:
$$\|u + v\|^2 + \|u - v\|^2 = 2\left(\|u\|^2 + \|v\|^2\right).$$
[Figure: Geometrically, the parallelogram identity asserts that AC² + BD² = 2(AB² + AD²). In words, the sum of the squares of the diagonals is twice the sum of the squares of any two adjacent sides.]
Conversely, every Banach space in which the parallelogram identity holds is a Hilbert space, and the inner product is uniquely determined by the norm by the polarization identity.[43] For real Hilbert spaces, the polarization identity is
$$\langle u, v\rangle = \frac{1}{4}\left(\|u + v\|^2 - \|u - v\|^2\right).$$
For complex Hilbert spaces, it is
$$\langle u, v\rangle = \frac{1}{4}\left(\|u + v\|^2 - \|u - v\|^2 + i\|u + iv\|^2 - i\|u - iv\|^2\right).$$
The parallelogram law implies that any Hilbert space is a uniformly convex Banach space.[44]
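Both identities are easy to check numerically in the finite-dimensional space C^3. The following sketch assumes the convention used in this article (inner product linear in the first argument); the vectors and the helper names `inner`, `norm_sq` and `add` are chosen for illustration.

```python
def inner(u, v):
    """<u, v>, linear in the first argument and conjugate-linear in the second."""
    return sum(a * b.conjugate() for a, b in zip(u, v))


def norm_sq(u):
    """Squared norm ||u||^2 = <u, u>."""
    return inner(u, u).real


def add(u, v, c=1):
    """Return u + c*v componentwise."""
    return [a + c * b for a, b in zip(u, v)]


u = [1 + 2j, -1j, 3 + 0j]
v = [2 - 1j, 4 + 0j, 1 + 1j]

# Parallelogram identity.
parallelogram_lhs = norm_sq(add(u, v)) + norm_sq(add(u, v, -1))
parallelogram_rhs = 2 * (norm_sq(u) + norm_sq(v))

# Complex polarization identity: recover <u, v> from norms alone.
polarization = 0.25 * (
    norm_sq(add(u, v)) - norm_sq(add(u, v, -1))
    + 1j * norm_sq(add(u, v, 1j)) - 1j * norm_sq(add(u, v, -1j))
)
```

With the opposite convention (linearity in the second argument) the signs of the two imaginary terms in the polarization identity swap.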
Best approximation
If C is a non-empty closed convex subset of a Hilbert space H and x a point in H, there exists a unique point $y \in C$ which minimizes the distance between x and points in C:[45]
$$y \in C, \qquad \|x - y\| = \operatorname{dist}(x, C) = \min\{\|x - z\| : z \in C\}.$$
This is equivalent to saying that there is a point with minimal norm in the translated convex set D = C − x. The proof consists in showing that every minimizing sequence $(d_n) \subset D$ is Cauchy (using the parallelogram identity) hence converges (using completeness) to a point in D that has minimal norm. More generally, this holds in any uniformly convex Banach space.[46]
When this result is applied to a closed subspace F of H, it can be shown that the point $y \in F$ closest to x is characterized by[47]
$$y \in F, \qquad x - y \perp F.$$
This point y is the orthogonal projection of x onto F, and the mapping $P_F : x \mapsto y$ is linear (see Orthogonal complements and projections). This result is especially significant in applied mathematics, especially numerical analysis, where it forms the basis of least squares methods.
In particular, when F is not equal to H, one can find a non-zero vector v orthogonal to F (select x not in F and v = x − y). A very useful criterion is obtained by applying this observation to the closed subspace F generated by a subset S of H.
A subset S of H spans a dense vector subspace if (and only if) the vector 0 is the sole vector $v \in H$ orthogonal to S.
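The characterization of the closest point by orthogonality of the residual can be sketched in R^3. The plane F, the point x and the helper names `dot` and `project` are assumptions for illustration; the projection formula used requires the basis of F to be orthonormal.

```python
def dot(u, v):
    """Euclidean inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))


def project(x, basis):
    """Orthogonal projection of x onto span(basis); basis must be orthonormal."""
    y = [0.0] * len(x)
    for e in basis:
        c = dot(x, e)
        y = [yi + c * ei for yi, ei in zip(y, e)]
    return y


# Orthonormal basis of a plane F in R^3 (assumed example).
basis = [[1.0, 0.0, 0.0], [0.0, 0.6, 0.8]]
x = [2.0, 1.0, 3.0]
y = project(x, basis)                              # closest point of F to x
residual = [xi - yi for xi, yi in zip(x, y)]       # x - y, orthogonal to F

# Any other point z of F is at least as far from x as y is.
z = [1.0, 0.6, 0.8]
dist_y = dot(residual, residual) ** 0.5
dist_z = sum((xi - zi) ** 2 for xi, zi in zip(x, z)) ** 0.5
```

This is the geometric content of least squares: the fitted vector is the projection, and the residual is orthogonal to every vector of the model subspace.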
Duality
The dual space H* is the space of all continuous linear functions from the space H into the base field. It carries a
natural norm, defined by
$$\|\varphi\| = \sup_{\|x\| = 1,\; x \in H} |\varphi(x)|.$$
This norm satisfies the parallelogram law, and so the dual space is also an inner product space. The dual space is also complete, and so it is a Hilbert space in its own right.
The Riesz representation theorem affords a convenient description of the dual. To every element u of H, there is a unique element $\varphi_u$ of H*, defined by
$$\varphi_u(x) = \langle x, u\rangle.$$
The mapping $u \mapsto \varphi_u$ is an antilinear mapping from H to H*. The Riesz representation theorem states that this mapping is an antilinear isomorphism.[48] Thus to every element $\varphi$ of the dual H* there exists one and only one $u_\varphi$ in H such that
$$\langle x, u_\varphi\rangle = \varphi(x)$$
for all $x \in H$. The inner product on the dual space H* satisfies
$$\langle \varphi, \psi\rangle = \langle u_\psi, u_\varphi\rangle.$$
The reversal of order on the right-hand side restores linearity in $\varphi$ from the antilinearity of $u_\varphi$. In the real case, the antilinear isomorphism from H to its dual is actually an isomorphism, and so real Hilbert spaces are naturally isomorphic to their own duals.
The representing vector $u_\varphi$ is obtained in the following way. When $\varphi \ne 0$, the kernel $F = \ker \varphi$ is a closed vector subspace of H, not equal to H, hence there exists a non-zero vector v orthogonal to F. The vector u is a suitable scalar multiple $\lambda v$ of v. The requirement that $\varphi(v) = \langle v, u\rangle$ yields
$$u = \langle v, v\rangle^{-1}\,\overline{\varphi(v)}\,v.$$
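In C^3 the construction of the representing vector can be followed step by step. In this sketch the functional (given by hypothetical coefficients `c`) and the test vector are assumptions for illustration, and the inner product is linear in its first argument as in the text.

```python
def inner(x, u):
    """<x, u>, linear in x and conjugate-linear in u."""
    return sum(a * b.conjugate() for a, b in zip(x, u))


# A linear functional on C^3: phi(x) = sum_k c_k * x_k (coefficients assumed for illustration).
c = [2 + 1j, -3j, 0.5 + 0j]
phi = lambda x: sum(ck * xk for ck, xk in zip(c, x))

# Representing vector: u_k = conjugate(c_k), so that <x, u> = phi(x) for every x.
u = [ck.conjugate() for ck in c]

# Any non-zero v orthogonal to ker(phi) is a scalar multiple of u; take v = 2i * u.
v = [2j * uk for uk in u]
scale = phi(v).conjugate() / inner(v, v)           # <v, v>^{-1} * conj(phi(v))
u_from_formula = [scale * vk for vk in v]

x = [1 + 1j, 2 + 0j, -1j]                          # arbitrary test vector
```

The complex conjugate on φ(v) is what makes the correspondence between u and φ antilinear rather than linear.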
This correspondence $\varphi \leftrightarrow u$ is exploited by the bra-ket notation popular in physics. It is common in physics to assume that the inner product, denoted by $\langle x|y\rangle$, is linear on the right,
$$\langle x|y\rangle = \langle y, x\rangle.$$
The result $\langle x|y\rangle$ can be seen as the action of the linear functional $\langle x|$ (the bra) on the vector $|y\rangle$ (the ket).
The Riesz representation theorem relies fundamentally not just on the presence of an inner product, but also on the
completeness of the space. In fact, the theorem implies that the topological dual of any inner product space can be
identified with its completion. An immediate consequence of the Riesz representation theorem is also that a Hilbert
space H is reflexive, meaning that the natural map from H into its double dual space is an isomorphism.
Weakly convergent sequences
In a Hilbert space H, a sequence $\{x_n\}$ is weakly convergent to a vector $x \in H$ when
$$\lim_n \langle x_n, v\rangle = \langle x, v\rangle$$
for every $v \in H$.
For example, any orthonormal sequence $\{f_n\}$ converges weakly to 0, as a consequence of Bessel's inequality. Every weakly convergent sequence $\{x_n\}$ is bounded, by the uniform boundedness principle.
Conversely, every bounded sequence in a Hilbert space admits weakly convergent subsequences (Alaoglu's theorem).[49] This fact may be used to prove minimization results for continuous convex functionals, in the same way that the Bolzano-Weierstrass theorem is used for continuous functions on $\mathbb{R}^d$. Among several variants, one simple statement is as follows:[50]
If $f : H \to \mathbb{R}$ is a convex continuous function such that f(x) tends to $+\infty$ when $\|x\|$ tends to $\infty$, then f admits a minimum at some point $x_0 \in H$.
This fact (and its various generalizations) is fundamental for direct methods in the calculus of variations.
Minimization results for convex functionals are also a direct consequence of the slightly more abstract fact that
closed bounded convex subsets in a Hilbert space H are weakly compact, since H is reflexive. The existence of
weakly convergent subsequences is a special case of the Eberlein-Smulian theorem.
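The weak convergence of an orthonormal sequence to 0 can be seen concretely with the standard basis vectors e_n of the sequence space of square-summable sequences. The test vector v (truncated to 1000 terms) and the helper name are assumptions for illustration.

```python
def pairing_with_basis_vector(v, n):
    """<e_n, v> for the n-th standard basis vector e_n (0-indexed); equals conj(v[n])."""
    return v[n].conjugate() if n < len(v) else 0j


# v = (1, 1/2, 1/3, ...) truncated to 1000 terms: a square-summable vector (assumed example).
v = [complex(1.0 / (k + 1)) for k in range(1000)]

# Every e_n has norm 1, so e_n does not tend to 0 in norm, yet the pairings shrink.
pairings = [abs(pairing_with_basis_vector(v, n)) for n in (0, 9, 99, 999)]
```

The pairings decrease like 1/(n + 1) for this v; for a general v they tend to 0 because the squared pairings are summable by Bessel's inequality.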
Banach space properties
Any general property of Banach spaces continues to hold for Hilbert spaces. The open mapping theorem states that a continuous surjective linear transformation from one Banach space to another is an open mapping, meaning that it sends open sets to open sets. A corollary is the bounded inverse theorem, that a continuous and bijective linear function from one Banach space to another is an isomorphism (that is, a continuous linear map whose inverse is also continuous). This theorem is considerably simpler to prove in the case of Hilbert spaces than in general Banach spaces. The open mapping theorem is equivalent to the closed graph theorem, which asserts that a linear function from one Banach space to another is continuous if and only if its graph is a closed set. In the case of Hilbert spaces, this is basic in the study of unbounded operators (see closed operator).
The (geometrical) Hahn-Banach theorem asserts that a closed convex set can be separated from any point outside it
by means of a hyperplane of the Hilbert space. This is an immediate consequence of the best approximation
property: if y is the element of a closed convex set F closest to x, then the separating hyperplane is the plane
perpendicular to the segment xy passing through its midpoint.
The distortion problem on Hilbert space asks whether or not every real valued Lipschitz function f defined on the sphere of a separable and infinite dimensional Hilbert space H stabilizes on the sphere of an infinite dimensional subspace, i.e. whether there is a real number $a \in \mathbb{R}$ so that for every $\varepsilon > 0$ there is an infinite dimensional subspace Y of H, so that $|a - f(y)| < \varepsilon$ for all $y \in Y$ with $\|y\| = 1$. This problem was solved negatively by E. Odell and Th. Schlumprecht (1994).
Operators on Hilbert spaces
Bounded operators
The continuous linear operators $A : H_1 \to H_2$ from a Hilbert space $H_1$ to a second Hilbert space $H_2$ are bounded in the sense that they map bounded sets to bounded sets. Conversely, if an operator is bounded, then it is continuous. The space of such bounded linear operators has a norm, the operator norm given by
$$\|A\| = \sup\{\|Ax\| : \|x\| \le 1\}.$$
The sum and the composite of two bounded linear operators is again bounded and linear. For y in $H_2$, the map that sends $x \in H_1$ to $\langle Ax, y\rangle$ is linear and continuous, and according to the Riesz representation theorem can therefore be represented in the form
$$\langle x, A^*y\rangle = \langle Ax, y\rangle$$
for some vector $A^*y$ in $H_1$. This defines another bounded linear operator $A^* : H_2 \to H_1$, the adjoint of A. One can see that $A^{**} = A$.
The set B(H) of all bounded linear operators on H, together with the addition and composition operations, the norm and the adjoint operation, is a C*-algebra, which is a type of operator algebra.
An element A of B(H) is called self-adjoint or Hermitian if A* = A. If A is Hermitian and $\langle Ax, x\rangle \ge 0$ for every x, then A is called non-negative, written $A \ge 0$; if equality holds only when x = 0, then A is called positive. The set of self-adjoint operators admits a partial order, in which $A \ge B$ if $A - B \ge 0$. If A has the form B*B for some B, then A is non-negative; if B is invertible, then A is positive. A converse is also true in the sense that, for a non-negative operator A, there exists a unique non-negative square root B such that
$$A = B^2 = B^*B.$$
In a sense made precise by the spectral theorem, self-adjoint operators can usefully be thought of as operators that are "real". An element A of B(H) is called normal if A*A = AA*. Normal operators decompose into the sum of a self-adjoint operator and an imaginary multiple of a self-adjoint operator
$$A = \frac{A + A^*}{2} + i\,\frac{A - A^*}{2i}$$
that commute with each other. Normal operators can also usefully be thought of in terms of their real and imaginary parts.
An element U of B(H) is called unitary if U is invertible and its inverse is given by U*. This can also be expressed by requiring that U be onto and $\langle Ux, Uy\rangle = \langle x, y\rangle$ for all x and y in H. The unitary operators form a group under composition, which is the isometry group of H.
An element of B(H) is compact if it sends bounded sets to relatively compact sets. Equivalently, a bounded operator T is compact if, for any bounded sequence $\{x_k\}$, the sequence $\{Tx_k\}$ has a convergent subsequence. Many integral operators are compact, and in fact define a special class of operators known as Hilbert-Schmidt operators that are especially important in the study of integral equations. Fredholm operators are those that are invertible modulo the compact operators (for instance, a nonzero multiple of the identity plus a compact operator), and are equivalently characterized as bounded operators with a finite-dimensional kernel and cokernel. The index of a Fredholm operator T is defined by
$$\operatorname{index} T = \dim \ker T - \dim \operatorname{coker} T.$$
The index is homotopy invariant, and plays a deep role in differential geometry via the Atiyah-Singer index
theorem.
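The adjoint and the decomposition into "real" and "imaginary" parts discussed in this section can be sketched for operators on C^2, represented as 2×2 matrices. The matrix A and the helper names are assumptions for illustration. The decomposition itself can be formed for any matrix; the two parts commute exactly when A is normal.

```python
def adjoint(a):
    """Conjugate transpose of a square matrix given as a list of rows."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]


def combine(a, b, ca, cb):
    """Return ca*a + cb*b entrywise."""
    return [[ca * x + cb * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def is_self_adjoint(a, tol=1e-12):
    """True when a equals its adjoint up to the tolerance."""
    ad = adjoint(a)
    return all(abs(x - y) < tol for ra, rb in zip(a, ad) for x, y in zip(ra, rb))


A = [[1 + 2j, 3 - 1j], [0 + 4j, 2 + 0j]]           # operator on C^2 (assumed example)
real_part = combine(A, adjoint(A), 0.5, 0.5)        # (A + A*) / 2
imag_part = combine(A, adjoint(A), -0.5j, 0.5j)     # (A - A*) / (2i)
rebuilt = combine(real_part, imag_part, 1, 1j)      # real_part + i * imag_part
```

Both `real_part` and `imag_part` are self-adjoint, and `rebuilt` reproduces A entry by entry.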
Unbounded operators
Unbounded operators are also tractable in Hilbert spaces, and have important applications to quantum mechanics.[54]
An unbounded operator T on a Hilbert space H is defined to be a linear operator whose domain D(T) is a linear
subspace of H. Often the domain D(T) is a dense subspace of H, in which case T is known as a densely-defined
operator.
The adjoint of a densely defined unbounded operator is defined in essentially the same manner as for bounded
operators. Self-adjoint unbounded operators play the role of the observables in the mathematical formulation of quantum mechanics. Examples of self-adjoint unbounded operators on the Hilbert space $L^2(\mathbb{R})$ are:[55]
• A suitable extension of the differential operator
$$(Af)(x) = i\frac{d}{dx}f(x),$$
where i is the imaginary unit and f is a differentiable function of compact support.
• The multiplication-by-x operator:
$$(Bf)(x) = xf(x).$$
These correspond to the momentum and position observables, respectively. Note that neither A nor B is defined on all of H, since in the case of A the derivative need not exist, and in the case of B the product function need not be square integrable. In both cases, the set of possible arguments forms a dense subspace of $L^2(\mathbb{R})$.
Constructions
Direct sums
Two Hilbert spaces $H_1$ and $H_2$ can be combined into another Hilbert space, called the (orthogonal) direct sum,[56] and denoted
$$H_1 \oplus H_2,$$
consisting of the set of all ordered pairs $(x_1, x_2)$ where $x_i \in H_i$, i = 1,2, and inner product defined by
$$\langle (x_1, x_2), (y_1, y_2)\rangle_{H_1 \oplus H_2} = \langle x_1, y_1\rangle_{H_1} + \langle x_2, y_2\rangle_{H_2}.$$
More generally, if $H_i$ is a family of Hilbert spaces indexed by $i \in I$, then the direct sum of the $H_i$, denoted
$$\bigoplus_{i \in I} H_i$$
consists of the set of all indexed families
$$x = (x_i \in H_i \mid i \in I) \in \prod_{i \in I} H_i$$
in the Cartesian product of the $H_i$ such that
$$\sum_{i \in I} \|x_i\|^2 < \infty.$$
The inner product is defined by
$$\langle x, y\rangle = \sum_{i \in I} \langle x_i, y_i\rangle_{H_i}.$$
Each of the $H_i$ is included as a closed subspace in the direct sum of all of the $H_i$. Moreover, the $H_i$ are pairwise orthogonal. Conversely, if there is a system of closed subspaces $V_i$, $i \in I$, in a Hilbert space H which are pairwise orthogonal and whose union spans a dense subspace of H, then H is canonically isomorphic to the direct sum of the $V_i$. In this case, H is called the internal direct sum of the $V_i$. A direct sum (internal or external) is also equipped with a family of orthogonal projections $E_i$ onto the ith direct summand $H_i$. These projections are bounded, self-adjoint, idempotent operators which satisfy the orthogonality condition
$$E_i E_j = 0, \qquad i \ne j.$$
The spectral theorem for compact self-adjoint operators on a Hilbert space H states that H splits into an orthogonal
direct sum of the eigenspaces of an operator, and also gives an explicit decomposition of the operator as a sum of
projections onto the eigenspaces. The direct sum of Hilbert spaces also appears in quantum mechanics as the Fock
space of a system containing a variable number of particles, where each Hilbert space in the direct sum corresponds
to an additional degree of freedom for the quantum mechanical system. In representation theory, the Peter-Weyl
theorem guarantees that any unitary representation of a compact group on a Hilbert space splits as the direct sum of
finite-dimensional representations.
Tensor products
If $H_1$ and $H_2$ are Hilbert spaces, then one defines an inner product on the (ordinary) tensor product as follows. On simple tensors, let
$$\langle x_1 \otimes x_2, y_1 \otimes y_2\rangle = \langle x_1, y_1\rangle\,\langle x_2, y_2\rangle.$$
This formula then extends by sesquilinearity to an inner product on $H_1 \otimes H_2$. The Hilbertian tensor product of $H_1$ and $H_2$, sometimes denoted by $H_1 \hat{\otimes} H_2$, is the Hilbert space obtained by completing $H_1 \otimes H_2$ for the metric associated to this inner product.
An example is provided by the Hilbert space $L^2([0,1])$. The Hilbertian tensor product of two copies of $L^2([0,1])$ is isometrically and linearly isomorphic to the space $L^2([0,1]^2)$ of square-integrable functions on the square $[0,1]^2$. This isomorphism sends a simple tensor $f_1 \otimes f_2$ to the function
$$(s, t) \mapsto f_1(s)\,f_2(t)$$
on the square.
This example is typical in the following sense.[58] Associated to every simple tensor product $x_1 \otimes x_2$ is the rank one operator
$$x^* \in H_1^* \mapsto x^*(x_1)\,x_2$$
from the (continuous) dual $H_1^*$ to $H_2$. This mapping defined on simple tensors extends to a linear identification between $H_1 \otimes H_2$ and the space of finite rank operators from $H_1^*$ to $H_2$. This extends to a linear isometry of the Hilbertian tensor product $H_1 \hat{\otimes} H_2$ with the Hilbert space $HS(H_1^*, H_2)$ of Hilbert-Schmidt operators from $H_1^*$ to $H_2$.
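In finite dimensions the tensor product inner product can be checked directly, representing a simple tensor by the Kronecker product of coordinate vectors. The vectors and the helper names `inner` and `tensor` are assumptions for illustration.

```python
def inner(u, v):
    """<u, v>, linear in u and conjugate-linear in v."""
    return sum(a * b.conjugate() for a, b in zip(u, v))


def tensor(x1, x2):
    """Coordinates of the simple tensor x1 (x) x2 in the product basis (Kronecker product)."""
    return [a * b for a in x1 for b in x2]


x1, x2 = [1 + 1j, 2 + 0j], [0 + 1j, 1 - 1j, 3 + 0j]
y1, y2 = [2 - 1j, 0 + 1j], [1 + 0j, 2 + 2j, 0 - 1j]

# The inner product of simple tensors factors into a product of inner products.
lhs = inner(tensor(x1, x2), tensor(y1, y2))
rhs = inner(x1, y1) * inner(x2, y2)
```

Here C^2 ⊗ C^3 is identified with C^6; no completion step is needed because every finite-dimensional inner product space is already complete.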
Orthonormal bases
The notion of an orthonormal basis from linear algebra generalizes over to the case of Hilbert spaces. In a Hilbert space H, an orthonormal basis is a family $\{e_k\}_{k \in B}$ of elements of H satisfying the conditions:
1. Orthogonality: Every two different elements of B are orthogonal: $\langle e_k, e_j\rangle = 0$ for all k, j in B with $k \ne j$.
2. Normalization: Every element of the family has norm 1: $\|e_k\| = 1$ for all k in B.
3. Completeness: The linear span of the family $e_k$, $k \in B$, is dense in H.
A system of vectors satisfying the first two conditions is called an orthonormal system or an orthonormal set (or an orthonormal sequence if B is countable). Such a system is always linearly independent. Completeness of an orthonormal system of vectors of a Hilbert space can be equivalently restated as:
if $\langle v, e_k\rangle = 0$ for all $k \in B$ and some $v \in H$ then v = 0.
This is related to the fact that the only vector orthogonal to a dense linear subspace is the zero vector, for if S is any complete orthonormal set and v is orthogonal to S, then v is orthogonal to the closure of the linear span of S, which is the whole space.
Examples of orthonormal bases include:
• the set {(1,0,0), (0,1,0), (0,0,1)} forms an orthonormal basis of $\mathbb{R}^3$ with the dot product;
• the sequence $\{f_n : n \in \mathbb{Z}\}$ with $f_n(x) = \exp(2\pi i n x)$ forms an orthonormal basis of the complex space $L^2([0,1])$.
In the infinite-dimensional case, an orthonormal basis will not be a basis in the sense of linear algebra; to distinguish
the two, the latter basis is also called a Hamel basis. That the span of the basis vectors is dense implies that every
vector in the space can be written as the sum of an infinite series, and the orthogonality implies that this
decomposition is unique.
Sequence spaces
The space $\ell^2$ of square-summable sequences of complex numbers has an orthonormal basis
$$e_1 = (1, 0, 0, \dots), \qquad e_2 = (0, 1, 0, \dots), \qquad \dots$$
More generally, if B is any set, then one can form a Hilbert space of sequences with index set B, defined by
$$\ell^2(B) = \Big\{ x : B \to \mathbb{C} \;\Big|\; \sum_{b \in B} |x(b)|^2 < \infty \Big\}.$$
The summation over B is here defined by
$$\sum_{b \in B} |x(b)|^2 = \sup \sum_{n=1}^{N} |x(b_n)|^2,$$
the supremum being taken over all finite subsets $\{b_1, \dots, b_N\}$ of B. It follows that, in order for this sum to be finite, every element of $\ell^2(B)$ has only countably many nonzero terms. This space becomes a Hilbert space with the inner product
$$\langle x, y\rangle = \sum_{b \in B} x(b)\,\overline{y(b)}$$
for all x and y in $\ell^2(B)$. Here the sum also has only countably many nonzero terms, and is unconditionally convergent by the Cauchy-Schwarz inequality.
An orthonormal basis of $\ell^2(B)$ is indexed by the set B, given by
$$e_b(b') = \begin{cases} 1 & \text{if } b = b' \\ 0 & \text{otherwise.} \end{cases}$$
Bessel's inequality and Parseval's formula
Let $f_1, \dots, f_n$ be a finite orthonormal system in H. For an arbitrary vector x in H, let
$$y = \sum_{j=1}^{n} \langle x, f_j\rangle\,f_j.$$
Then $\langle x, f_k\rangle = \langle y, f_k\rangle$ for every k = 1, ..., n. It follows that x − y is orthogonal to each $f_k$, hence x − y is orthogonal to y. Using the Pythagorean identity twice, it follows that
$$\|x\|^2 = \|x - y\|^2 + \|y\|^2 \ge \|y\|^2 = \sum_{j=1}^{n} |\langle x, f_j\rangle|^2.$$
Let $\{f_i\}$, $i \in I$, be an arbitrary orthonormal system in H. Applying the preceding inequality to every finite subset J of I gives the Bessel inequality
$$\sum_{i \in I} |\langle x, f_i\rangle|^2 \le \|x\|^2, \qquad x \in H$$
(according to the definition of the sum of an arbitrary family of non-negative real numbers).
Geometrically, Bessel's inequality implies that the orthogonal projection of x onto the linear subspace spanned by the $f_i$ has norm that does not exceed that of x. In two dimensions, this is the assertion that the length of the leg of a right triangle may not exceed the length of the hypotenuse.
Bessel's inequality is a stepping stone to the more powerful Parseval identity which governs the case when Bessel's inequality is actually an equality. If $\{e_k\}_{k \in B}$ is an orthonormal basis of H, then every element x of H may be written as
$$x = \sum_{k \in B} \langle x, e_k\rangle\,e_k.$$
Even if B is uncountable, Bessel's inequality guarantees that the expression is well-defined and consists only of countably many nonzero terms. This sum is called the Fourier expansion of x, and the individual coefficients $\langle x, e_k\rangle$ are the Fourier coefficients of x. Parseval's formula is then
$$\|x\|^2 = \sum_{k \in B} |\langle x, e_k\rangle|^2.$$
Conversely, if $\{e_k\}$ is an orthonormal set such that Parseval's identity holds for every x, then $\{e_k\}$ is an orthonormal basis.
Hilbert dimension
As a consequence of Zorn's lemma, every Hilbert space admits an orthonormal basis; furthermore, any two
orthonormal bases of the same space have the same cardinality, called the Hilbert dimension of the space.[61] For
instance, since $\ell^2(B)$ has an orthonormal basis indexed by B, its Hilbert dimension is the cardinality of B (which may
be a finite integer, or a countable or uncountable cardinal number).
As a consequence of Parseval's identity, if $\{e_k\}_{k \in B}$ is an orthonormal basis of H, then the map $\Phi : H \to \ell^2(B)$
defined by $\Phi(x) = (\langle x, e_k \rangle)_{k \in B}$ is an isometric isomorphism of Hilbert spaces: it is a bijective linear mapping such that
$$\langle \Phi(x), \Phi(y) \rangle_{\ell^2(B)} = \langle x, y \rangle_H$$
for all x and y in H. The cardinal number of B is the Hilbert dimension of H. Thus every Hilbert space is
isometrically isomorphic to a sequence space $\ell^2(B)$ for some set B.
Separable spaces
A Hilbert space is separable if and only if it admits a countable orthonormal basis. All infinite-dimensional separable
Hilbert spaces are therefore isometrically isomorphic to $\ell^2$.
In the past, Hilbert spaces were often required to be separable as part of the definition.[62] Most spaces used in
physics are separable, and since these are all isomorphic to each other, one often refers to any infinite-dimensional
separable Hilbert space as "the Hilbert space" or just "Hilbert space".[63] Even in quantum field theory, most of the
Hilbert spaces are in fact separable, as stipulated by the Wightman axioms. However, it is sometimes argued that
non-separable Hilbert spaces are also important in quantum field theory, roughly because the systems in the theory
possess an infinite number of degrees of freedom and any infinite Hilbert tensor product (of spaces of dimension
greater than one) is non-separable.[64] For instance, a bosonic field can be naturally thought of as an element of a
tensor product whose factors represent harmonic oscillators at each point of space. From this perspective, the natural
state space of a boson might seem to be a non-separable space.[64] However, it is only a small separable subspace of
the full tensor product that can contain physically meaningful fields (on which the observables can be defined).
Another non-separable Hilbert space models the state of an infinite collection of particles in an unbounded region of
space. An orthonormal basis of the space is indexed by the density of the particles, a continuous parameter, and since
the set of possible densities is uncountable, the basis is not countable.[64]
Orthogonal complements and projections
If S is a subset of a Hilbert space H, the set of vectors orthogonal to S is defined by
$$S^\perp = \{x \in H : \langle x, s \rangle = 0 \ \forall s \in S\}.$$
$S^\perp$ is a closed subspace of H and so forms itself a Hilbert space. If V is a closed subspace of H, then $V^\perp$ is called the
orthogonal complement of V. In fact, every x in H can then be written uniquely as x = v + w, with v in V and w in
$V^\perp$. Therefore, H is the internal Hilbert direct sum of V and $V^\perp$.
The linear operator $P_V : H \to H$ which maps x to v is called the orthogonal projection onto V. There is a natural
one-to-one correspondence between the set of all closed subspaces of H and the set of all bounded self-adjoint
operators P such that $P^2 = P$. Specifically,
Theorem. The orthogonal projection $P_V$ is a self-adjoint linear operator on H of norm $\le 1$ with the property
$P_V^2 = P_V$. Moreover, any self-adjoint linear operator E such that $E^2 = E$ is of the form $P_V$, where V is the range
of E. For every x in H, $P_V(x)$ is the unique element v of V which minimizes the distance $\|x - v\|$.[65]
This provides the geometrical interpretation of $P_V(x)$: it is the best approximation to x by elements of V.
An operator P such that $P = P^2 = P^*$ is called an orthogonal projection. The orthogonal projection $P_V$ onto a closed
subspace V of H is the adjoint of the inclusion mapping
$$i_V : V \to H,$$
meaning that
$$\langle i_V x, y \rangle_H = \langle x, P_V y \rangle_V$$
for all $x \in V$ and $y \in H$. Projections $P_U$ and $P_V$ are called mutually orthogonal if $P_U P_V = 0$. This is equivalent to U
and V being orthogonal as subspaces of H. As a result, the sum of the two projections $P_U$ and $P_V$ is only a projection
if U and V are orthogonal to each other, and in that case $P_U + P_V = P_{U+V}$. The composite $P_U P_V$ is generally not a
projection; in fact, the composite is a projection if and only if the two projections commute, and in that case
$$P_U P_V = P_{U \cap V}.$$
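These algebraic properties of projections can be illustrated in finite dimensions. The sketch below uses plain Python with matrices stored as lists of rows; the three lines U, V, W in R^3 and all helper names are assumptions made for illustration only.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(a, b):
    """Entrywise sum of two matrices of the same shape."""
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def transpose(a):
    return [list(col) for col in zip(*a)]

def close(a, b, tol=1e-12):
    """Entrywise comparison up to a small tolerance."""
    return all(abs(p - q) <= tol for ra, rb in zip(a, b) for p, q in zip(ra, rb))

def projection_onto_line(v):
    """Orthogonal projection of R^n onto span{v}, v != 0: P = v v^T / <v, v>."""
    n2 = sum(c * c for c in v)
    return [[p * q / n2 for q in v] for p in v]

def is_orthogonal_projection(p):
    """Check P = P^2 = P* (for real matrices the adjoint is the transpose)."""
    return close(matmul(p, p), p) and close(transpose(p), p)

# Illustrative lines in R^3 (assumed): U and V are orthogonal, U and W are not.
P_U = projection_onto_line([1.0, 0.0, 0.0])
P_V = projection_onto_line([0.0, 1.0, 0.0])
P_W = projection_onto_line([1.0, 1.0, 0.0])
ZERO = [[0.0] * 3 for _ in range(3)]

print(close(matmul(P_U, P_V), ZERO))               # mutually orthogonal: P_U P_V = 0
print(is_orthogonal_projection(add(P_U, P_V)))     # so P_U + P_V is a projection
print(is_orthogonal_projection(add(P_U, P_W)))     # U, W not orthogonal: sum fails
print(close(matmul(P_U, P_W), matmul(P_W, P_U)))   # P_U and P_W do not commute
print(is_orthogonal_projection(matmul(P_U, P_W)))  # so the composite fails too
```

The first two checks succeed and the last three fail, matching the statements that sums require orthogonal ranges and composites require commuting projections.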
The operator norm of a projection P onto a non-zero closed subspace is equal to one:
$$\|P\| = \sup_{x \in H,\, x \ne 0} \frac{\|Px\|}{\|x\|} = 1.$$
Every closed subspace V of a Hilbert space is therefore the image of an operator P of norm one such that $P^2 = P$. In
fact this property characterizes Hilbert spaces:[66]
• A Banach space of dimension higher than 2 is (isometrically) a Hilbert space if and only if, to every closed
subspace V, there is an operator $P_V$ of norm one whose image is V such that $P_V^2 = P_V$.
While this result characterizes the metric structure of a Hilbert space, the structure of a Hilbert space as a topological
vector space can itself be characterized in terms of the presence of complementary subspaces:[67]
• A Banach space X is topologically and linearly isomorphic to a Hilbert space if and only if, to every closed
subspace V, there is a closed subspace W such that X is equal to the internal direct sum $V \oplus W$.
The orthogonal complement satisfies some more elementary results. It is a monotone function in the sense that if
$U \subset V$, then $V^\perp \subseteq U^\perp$ with equality holding if and only if V is contained in the closure of U. This result is a
special case of the Hahn-Banach theorem. The closure of a subspace can be completely characterized in terms of the
orthogonal complement: If V is a subspace of H, then the closure of V is equal to $V^{\perp\perp}$. The orthogonal
complement is thus a Galois connection on the partial order of subspaces of a Hilbert space. In general, the
orthogonal complement of a sum of subspaces is the intersection of the orthogonal complements:[68]
$$\Big(\sum_i V_i\Big)^\perp = \bigcap_i V_i^\perp.$$
If the $V_i$ are in addition closed, then
$$\overline{\sum_i V_i^\perp} = \Big(\bigcap_i V_i\Big)^\perp.$$
Spectral theory
There is a well-developed spectral theory for self-adjoint operators in a Hilbert space, that is roughly analogous to
the study of symmetric matrices over the reals or self-adjoint matrices over the complex numbers.[69] In the same
sense, one can obtain a "diagonalization" of a self-adjoint operator as a suitable sum (actually an integral) of
orthogonal projection operators.
The spectrum of an operator T, denoted $\sigma(T)$, is the set of complex numbers $\lambda$ such that $T - \lambda$ lacks a continuous
inverse. If T is bounded, then the spectrum is always a compact set in the complex plane, and lies inside the disc
$|z| \le \|T\|$. If T is self-adjoint, then the spectrum is real. In fact, it is contained in the interval [m, M] where
$$m = \inf_{\|x\| = 1} \langle Tx, x \rangle, \qquad M = \sup_{\|x\| = 1} \langle Tx, x \rangle.$$
Moreover, m and M are both actually contained within the spectrum.
The eigenspaces of an operator T are given by
$$H_\lambda = \ker(T - \lambda).$$
Unlike with finite matrices, not every element of the spectrum of T must be an eigenvalue: the linear operator $T - \lambda$
may only lack an inverse because it is not surjective. Elements of the spectrum of an operator in the general sense are
known as spectral values. Since spectral values need not be eigenvalues, the spectral decomposition is often more
subtle than in finite dimensions.
However, the spectral theorem of a self-adjoint operator T takes a particularly simple form if, in addition, T is
assumed to be a compact operator. The spectral theorem for compact self-adjoint operators states:[70]
• A compact self-adjoint operator T has only countably (or finitely) many spectral values. The spectrum of T has no
limit point in the complex plane except possibly zero. The eigenspaces of T decompose H into an orthogonal
direct sum:
$$H = \bigoplus_{\lambda \in \sigma(T)} H_\lambda.$$
Moreover, if $E_\lambda$ denotes the orthogonal projection onto the eigenspace $H_\lambda$, then
$$T = \sum_{\lambda \in \sigma(T)} \lambda E_\lambda,$$
where the sum converges with respect to the norm on B(H).
This theorem plays a fundamental role in the theory of integral equations, as many integral operators are compact, in
particular those that arise from Hilbert-Schmidt operators.
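In finite dimensions every operator is compact, so the decomposition of a self-adjoint operator into a sum of eigenvalues times orthogonal projections can be checked by hand. The following plain-Python sketch does this for one symmetric 2 x 2 matrix; the matrix T, its hard-coded eigenpairs, and the helper names are illustrative assumptions, not part of the text.

```python
import math

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(a, b):
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * p for p in row] for row in a]

def close(a, b, tol=1e-12):
    return all(abs(p - q) <= tol for ra, rb in zip(a, b) for p, q in zip(ra, rb))

def rank_one_projection(v):
    """Projection onto span{v} for a unit vector v: E = v v^T."""
    return [[p * q for q in v] for p in v]

# Illustrative self-adjoint operator on R^2 (assumed example).
T = [[2.0, 1.0], [1.0, 2.0]]

# Its eigenpairs, computed by hand: T v = lambda v with unit eigenvectors.
s = 1.0 / math.sqrt(2.0)
eigenpairs = [(1.0, (s, -s)), (3.0, (s, s))]

# Spectral projections E_lambda and the sum  sum_lambda lambda * E_lambda.
projections = [(lam, rank_one_projection(v)) for lam, v in eigenpairs]
reconstructed = [[0.0, 0.0], [0.0, 0.0]]
for lam, E in projections:
    reconstructed = add(reconstructed, scale(lam, E))

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
ZERO = [[0.0, 0.0], [0.0, 0.0]]
E1, E3 = projections[0][1], projections[1][1]

print(close(reconstructed, T))        # T equals the sum of lambda * E_lambda
print(close(add(E1, E3), IDENTITY))   # the eigenspaces span the whole space
print(close(matmul(E1, E3), ZERO))    # distinct eigenspaces are orthogonal
```

For an infinite-dimensional compact operator the same sum has countably many terms and converges in operator norm; this sketch only covers the finite case.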
The general spectral theorem for self-adjoint operators involves a kind of operator-valued Riemann-Stieltjes
integral, rather than an infinite summation.[71] The spectral family associated to T associates to each real number $\lambda$ an
operator $E_\lambda$, which is the projection onto the nullspace of the operator $(T - \lambda)^+$, where the positive part of a
self-adjoint operator is defined by
$$A^+ = \tfrac{1}{2}\left(\sqrt{A^2} + A\right).$$
The operators $E_\lambda$ are monotone increasing relative to the partial order defined on self-adjoint operators; the
eigenvalues correspond precisely to the jump discontinuities. One has the spectral theorem, which asserts
$$T = \int_{\mathbf{R}} \lambda\, dE_\lambda.$$
The integral is understood as a Riemann-Stieltjes integral, convergent with respect to the norm on B(H). In
particular, one has the ordinary scalar-valued integral representation
$$\langle Tx, y \rangle = \int_{\mathbf{R}} \lambda\, d\langle E_\lambda x, y \rangle.$$
A somewhat similar spectral decomposition holds for normal operators, although because the spectrum may now
contain non-real complex numbers, the operator-valued Stieltjes measure $dE_\lambda$ must instead be replaced by a
resolution of the identity.
A major application of spectral methods is the spectral mapping theorem, which allows one to apply to a self-adjoint
operator T any continuous complex function f defined on the spectrum of T by forming the integral
$$f(T) = \int_{\sigma(T)} f(\lambda)\, dE_\lambda.$$
The resulting continuous functional calculus has applications in particular to pseudodifferential operators.[72]
The spectral theory of unbounded self-adjoint operators is only marginally more difficult than for bounded operators.
The spectrum of an unbounded operator is defined in precisely the same way as for bounded operators: $\lambda$ is a spectral
value if the resolvent operator
$$R_\lambda = (T - \lambda)^{-1}$$
fails to be a well-defined continuous operator. The self-adjointness of T still guarantees that the spectrum is real.
Thus the essential idea of working with unbounded operators is to look instead at the resolvent $R_\lambda$ where $\lambda$ is
non-real. This is a bounded normal operator, which admits a spectral representation that can then be transferred to a
spectral representation of T itself. A similar strategy is used, for instance, to study the spectrum of the Laplace
operator: rather than address the operator directly, one instead looks at an associated resolvent such as a Riesz
potential or Bessel potential.
A precise version of the spectral theorem which holds in this case is:[73]
Given a densely-defined self-adjoint operator T on a Hilbert space H, there corresponds a unique resolution of
the identity E on the Borel sets of R, such that
$$\langle Tx, y \rangle = \int_{\mathbf{R}} \lambda\, dE_{x,y}(\lambda)$$
for all $x \in D(T)$ and $y \in H$. The spectral measure E is concentrated on the spectrum of T.
There is also a version of the spectral theorem that applies to unbounded normal operators.
Notes
[1] Marsden 1974, §2.8
[2] The mathematical material in this section can be found in any good textbook on functional analysis, such as Dieudonne (1960), Hewitt &
Stromberg (1965), Reed & Simon (1980) or Rudin (1980).
[3] In some conventions, inner products are linear in their second arguments instead.
[4] Dieudonne 1960, §6.2
[5] Dieudonne 1960
[6] Largely from the work of Hermann Grassmann, at the urging of August Ferdinand Mobius (Boyer & Merzbach 1991, pp. 584-586). The first
modern axiomatic account of abstract vector spaces ultimately appeared in Giuseppe Peano's 1888 account (Grattan-Guinness 2000, §5.2.2;
O'Connor & Robertson 1996).
[7] A detailed account of the history of Hilbert spaces can be found in Bourbaki 1987.
[8] Schmidt 1908
[9] Titchmarsh 1946, §IX.1
[10] Lebesgue 1904. Further details on the history of integration theory can be found in Bourbaki (1987) and Saks (2005).
[11] Bourbaki 1987.
[12] Dunford & Schwartz 1958, §IV.16
[13] In Dunford & Schwartz (1958, §IV.16), the result that every linear functional on $L^2[0,1]$ is represented by integration is jointly attributed to
Frechet (1907) and Riesz (1907). The general result, that the dual of a Hilbert space is identified with the Hilbert space itself, can be found in
Riesz (1934).
[14] von Neumann 1929.
[15] Kline 1972, p. 1092
[16] Hilbert, Nordheim & von Neumann 1927.
[17] Weyl 1931.
[18] Prugovecki 1981, pp. 1-10.
[19] von Neumann 1932
[20] Halmos 1957, Section 42.
[21] Hewitt & Stromberg 1965.
[22] Bers, John & Schechter 1981.
[23] Giusti 2003.
[24] Stein 1970
[25] Details can be found in Warner (1983).
[26] A general reference on Hardy spaces is the book Duren (1970).
[27] Krantz 2002, §1.4
[28] Krantz 2002, §1.5
[29] Young 1987, Chapter 9.
[30] The eigenvalues of the Fredholm kernel are $1/\lambda$, which tend to zero.
[31] More detail on finite element methods from this point of view can be found in Brenner & Scott (2005).
[32] Reed & Simon 1980
[33] A treatment of Fourier series from this point of view is available, for instance, in Rudin (1987) or Folland (2009).
[34] Halmos 1957, §5
[35] Bachman, Narici & Beckenstein 2000
[36] Stein & Weiss 1971, §IV.2.
[37] Lanczos 1988, pp. 212-213
[38] Lanczos 1988, Equation 4-3.10
[39] The classic reference for spectral methods is Courant & Hilbert 1953. A more up-to-date account is Reed & Simon 1975.
[40] Kac 1966
[41] Dirac 1930
[42] von Neumann 1955
[43] Young 1988, p. 23.
[44] Clarkson 1936.
[45] Rudin 1987, Theorem 4.10
[46] Dunford & Schwartz 1958, II.4.29
[47] Rudin 1987, Theorem 4.11
[48] Weidmann 1980, Theorem 4.8
[49] Weidmann 1980, §4.5
[50] Buttazzo, Giaquinta & Hildebrandt 1998, Theorem 5.17
[51] Halmos 1982, Problem 52, 58
[52] Rudin 1973
[53] Treves 1967, Chapter 18
[54] See Prugovecki (1981), Reed & Simon (1980, Chapter VIII) and Folland (1989).
[55] Prugovecki 1981, III, §1.4
[56] Dunford & Schwartz 1958, IV.4.17-18
[57] Weidmann 1980, §3.4
[58] Kadison & Ringrose 1983, Theorem 2.6.4
[59] Dunford & Schwartz 1958, §IV.4.
[60] For the case of finite index sets, see, for instance, Halmos 1957, §5. For infinite index sets, see Weidmann 1980, Theorem 3.6.
[61] Levitan 2001. Many authors, such as Dunford & Schwartz (1958, §IV.4), refer to this just as the dimension. Unless the Hilbert space is finite
dimensional, this is not the same thing as its dimension as a linear space (the cardinality of a Hamel basis).
[62] Prugovecki 1981, I, §4.2
[63] von Neumann (1955) defines a Hilbert space via a countable Hilbert basis, which amounts to an isometric isomorphism with $\ell^2$. The
convention still persists in most rigorous treatments of quantum mechanics; see for instance Sobrino 1996, Appendix B.
[64] Streater & Wightman 1964, pp. 86-87
[65] Young 1988, Theorem 15.3
[66] Kakutani 1939
[67] Lindenstrauss & Tzafriri 1971
[68] Halmos 1957, §12
[69] A general account of spectral theory in Hilbert spaces can be found in Riesz & Sz Nagy (1990). A more sophisticated account in the
language of C*-algebras is in Rudin (1973) or Kadison & Ringrose (1997)
[70] See, for instance, Riesz & Sz Nagy (1990, Chapter VI) or Weidmann 1980, Chapter 7. This result was already known to Schmidt (1907) in
the case of operators arising from integral kernels.
[71] Riesz & Sz Nagy 1990, §§107-108
[72] Shubin 1987
[73] Rudin 1973, Theorem 13.30.
References
• Bachman, George; Narici, Lawrence; Beckenstein, Edward (2000), Fourier and wavelet analysis, Universitext,
Berlin, New York: Springer-Verlag, MR1729490, ISBN 978-0-387-98899-3.
• Bers, Lipman; John, Fritz; Schechter, Martin (1981), Partial differential equations, American Mathematical
Society, ISBN 0821800493.
• Bourbaki, Nicolas (1986), Spectral theories, Elements of mathematics, Berlin: Springer-Verlag, ISBN
0201007673.
• Bourbaki, Nicolas (1987), Topological vector spaces, Elements of mathematics, Berlin: Springer-Verlag,
ISBN 978-3540136279.
• Boyer, Carl Benjamin; Merzbach, Uta C (1991), A History of Mathematics (2nd ed.), John Wiley & Sons, Inc.,
ISBN 0-471-54397-7.
• Brenner, S.; Scott, R. L. (2005), The Mathematical Theory of Finite Element Methods (2nd ed.), Springer,
ISBN 0-3879-5451-1.
• Buttazzo, Giuseppe; Giaquinta, Mariano; Hildebrandt, Stefan (1998), One-dimensional variational problems,
Oxford Lecture Series in Mathematics and its Applications, 15, The Clarendon Press Oxford University Press,
MR1694383, ISBN 978-0-19-850465-8.
• Clarkson, J. A. (1936), "Uniformly convex spaces" (http://www.jstor.org/stable/1989630), Trans. Amer. Math.
Soc. 40 (3): 396-414, doi:10.2307/1989630.
• Courant, Richard; Hilbert, David (1953), Methods of Mathematical Physics, Vol. I, Interscience.
• Dieudonne, Jean (1960), Foundations of Modern Analysis, Academic Press.
• Dirac, P.A.M. (1930), The Principles of Quantum Mechanics, Oxford: Clarendon Press.
• Dunford, N.; Schwartz, J.T. (1958), Linear operators, Parts I and II, Wiley-Interscience.
• Duren, P. (1970), Theory of H p -Spaces, New York: Academic Press.
• Folland, Gerald B. (2009), Fourier analysis and its application (http://books.google.com/books?as_isbn=0821847902)
(Reprint of Wadsworth and Brooks/Cole 1992 ed.), American Mathematical Society
Bookstore, ISBN 0821847902.
• Folland, Gerald B. (1989), Harmonic analysis in phase space, Annals of Mathematics Studies, 122, Princeton
University Press, ISBN 0-691-08527-7.
• Frechet, Maurice (1907), "Sur les ensembles de fonctions et les operations lineaires", C. R. Acad. Sci. Paris 144:
1414-1416.
• Frechet, Maurice (1904-1907), Sur les operations lineaires.
• Giusti, Enrico (2003), Direct Methods in the Calculus of Variations, World Scientific, ISBN 981-238-043-4.
• Grattan-Guinness, Ivor (2000), The search for mathematical roots, 1870-1940, Princeton Paperbacks, Princeton
University Press, MR1807717, ISBN 978-0-691-05858-0.
• Halmos, Paul (1957), Introduction to Hilbert Space and the Theory of Spectral Multiplicity, Chelsea Pub. Co.
• Halmos, Paul (1982), A Hilbert Space Problem Book, Springer-Verlag, ISBN
0387906851.
• Hewitt, Edwin; Stromberg, Karl (1965), Real and Abstract Analysis, Springer-Verlag.
• Hilbert, David; Nordheim, Lothar (Wolfgang); von Neumann, John (1927), "Über die Grundlagen der
Quantenmechanik" (http://dz-srvl.sub.uni-goettingen.de/sub/digbib/loader?ht=VIEW&did=D27779),
Mathematische Annalen 98: 1-30, doi:10.1007/BF01451579.
• Kac, Mark (1966), "Can one hear the shape of a drum?" (http://jstor.org/stable/2313748), American
Mathematical Monthly 73 (4, part 2): 1-23, doi:10.2307/2313748.
• Kadison, Richard V.; Ringrose, John R. (1997), Fundamentals of the theory of operator algebras. Vol. I, Graduate
Studies in Mathematics, 15, Providence, R.I.: American Mathematical Society, MR1468229,
ISBN 978-0-8218-0819-1.
• Kakutani, Shizuo (1939), "Some characterizations of Euclidean space", Jap. J. Math. 16: 93-97, MR0000895.
• Kline, Morris (1972), Mathematical thought from ancient to modern times, Volume 3 (3rd ed.), Oxford University
Press (published 1990), ISBN 978-0195061376.
• Kolmogorov, Andrey; Fomin, Sergei V. (1970), Introductory Real Analysis (Revised English edition, trans. by
Richard A. Silverman (1975) ed.), Dover Press, ISBN 0-486-61226-0.
• Krantz, Steven G. (2002), Function Theory of Several Complex Variables, Providence, R.I.: American
Mathematical Society, ISBN 978-0-8218-2724-6.
• Lanczos, Cornelius (1988), Applied analysis (http://books.google.com/books?as_isbn=048665656X) (Reprint
of 1956 Prentice-Hall ed.), Dover Publications, ISBN 048665656X.
• Lindenstrauss, J.; Tzafriri, L. (1971), "On the complemented subspaces problem", Israel Journal of Mathematics
9: 263-269, doi:10.1007/BF02771592, MR0276734, ISSN 0021-2172.
• O'Connor, John J.; Robertson, Edmund F. (1996), "Abstract linear spaces"
(http://www-history.mcs.st-andrews.ac.uk/HistTopics/Abstract_linear_spaces.html), MacTutor History of Mathematics archive, University of St
Andrews.
• Lebesgue, Henri (1904), Leçons sur l'intégration et la recherche des fonctions primitives
(http://books.google.com/?id=VfUKAAAAYAAJ&dq="Lebesgue" "Leçons sur l'intégration et la recherche des fonctions ..."&pg=PA1#v=onepage&q=),
Gauthier-Villars.
• B.M. Levitan (2001), "Hilbert space" (http://eom.springer.de/H/h047380.htm), in Hazewinkel, Michiel,
Encyclopaedia of Mathematics, Springer, ISBN 978-1556080104.
• Marsden, Jerrold E. (1974), Elementary classical analysis, W. H. Freeman and Co., MR0357693.
• Odell, E.; Schlumprecht, Th. (1993), "The distortion problem of Hilbert space", Geom. Funct. Anal. 3: 201-207,
doi:10.1007/BF01896023, MR1209302, ISSN 1016-443X.
• Odell, E.; Schlumprecht, Th. (1994), "The distortion problem", Acta Mathematica 173: 259-281,
doi:10.1007/BF02398436, MR1301394, ISSN 0001-5962.
• Prugovecki, Eduard (1981), Quantum mechanics in Hilbert space (2nd ed.), Dover (published 2006),
ISBN 978-0486453279.
• Reed, Michael; Simon, Barry (1980), Functional Analysis, Methods of Modern Mathematical Physics, Academic
Press, ISBN 0-12-585050-6.
• Reed, Michael; Simon, Barry (1975), Fourier Analysis, Self-Adjointness, Methods of Modern Mathematical
Physics, Academic Press, ISBN 0-12-5850002-6.
• Riesz, Frigyes (1907), "Sur une espece de Geometrie analytique des systemes de fonctions sommables", C. R.
Acad. Sci. Paris 144: 1409-1411.
• Riesz, Frigyes (1934), "Zur Theorie des Hilbertschen Raumes", Acta Sci. Math. Szeged 7: 34-38.
• Riesz, Frigyes; Sz.-Nagy, Bela (1990), Functional analysis, Dover, ISBN 0-486-66289-6.
• Rudin, Walter (1973), Functional analysis, Tata MacGraw-Hill.
• Rudin, Walter (1987), Real and Complex Analysis, McGraw-Hill, ISBN 0-07-100276-6.
• Saks, Stanislaw (2005), Theory of the integral (2nd Dover ed.), Dover, ISBN 978-0486446486; originally
published Monografje Matematyczne, vol. 7, Warszawa, 1937.
• Schmidt, Erhard (1908), "Über die Auflösung linearer Gleichungen mit unendlich vielen Unbekannten", Rend.
Circ. Mat. Palermo 25: 63-77, doi:10.1007/BF03029116.
• Shubin, M. A. (1987), Pseudodifferential operators and spectral theory, Springer Series in Soviet Mathematics,
Berlin, New York: Springer-Verlag, MR883081, ISBN 978-3-540-13621-7.
• Sobrino, Luis (1996), Elements of non-relativistic quantum mechanics, River Edge, NJ: World Scientific
Publishing Co. Inc., MR1626401, ISBN 9789810223861.
• Stewart, James (2006), Calculus: Concepts and Contexts (3rd ed.), Thomson/Brooks/Cole.
• Stein, E (1970), Singular Integrals and Differentiability Properties of Functions, Princeton Univ. Press,
ISBN 0-691-08079-8.
• Stein, Elias; Weiss, Guido (1971), Introduction to Fourier Analysis on Euclidean Spaces, Princeton, N.J.:
Princeton University Press, ISBN 978-0-691-08078-9.
• Streater, Ray; Wightman, Arthur (1964), PCT, Spin and Statistics and All That, W. A. Benjamin, Inc.
• Titchmarsh, Edward Charles (1946), Eigenfunction expansions, part 1, Oxford University: Clarendon Press.
• Treves, Francois (1967), Topological Vector Spaces, Distributions and Kernels, Academic Press.
• von Neumann, John (1929), "Allgemeine Eigenwerttheorie Hermitescher Funktionaloperatoren", Mathematische
Annalen 102: 49-131, doi:10.1007/BF01782338.
• von Neumann, John (1932), "Physical Applications of the Ergodic Hypothesis" (http://www.jstor.org/stable/
86260), Proc Natl Acad Sci USA 18 (3): 263-266, doi:10.1073/pnas.18.3.263, PMID 16587674, PMC 1076204.
• von Neumann, John (1955), Mathematical foundations of quantum mechanics, Princeton Landmarks in
Mathematics, Princeton University Press (published 1996), MR1435976, ISBN 978-0-691-02893-4.
• Warner, Frank (1983), Foundations of Differentiable Manifolds and Lie Groups, Berlin, New York:
Springer-Verlag, ISBN 978-0-387-90894-6.
• Weidmann, Joachim (1980), Linear operators in Hilbert spaces, Graduate Texts in Mathematics, 68, Berlin, New
York: Springer-Verlag, MR566954, ISBN 978-0-387-90427-6.
• Weyl, Hermann (1931), The Theory of Groups and Quantum Mechanics (English 1950 ed.), Dover Press,
ISBN 0-486-60269-9.
• Young, N (1988), An introduction to Hilbert space, Cambridge University Press, ISBN 0-521-33071-8.
External links
• Hilbert Space at Mathworld (http://mathworld.wolfram.com/HilbertSpace.html)
• 245B, notes 5: Hilbert spaces (http://terrytao.wordpress.com/2009/01/17/254a-notes-5-hilbert-spaces/) by
Terence Tao
Von Neumann algebra
In mathematics, a von Neumann algebra or W*-algebra is a *-algebra of bounded operators on a Hilbert space that
is closed in the weak operator topology and contains the identity operator. They were originally introduced by John
von Neumann, motivated by the study of single operators, group representations, ergodic theory and quantum
mechanics. His double commutant theorem shows that the analytic definition is equivalent to a purely algebraic
definition as an algebra of symmetries.
Two basic examples of von Neumann algebras are as follows. The ring $L^\infty(\mathbf{R})$ of essentially bounded measurable
functions on the real line is a commutative von Neumann algebra, which acts by pointwise multiplication on the
Hilbert space $L^2(\mathbf{R})$ of square integrable functions. The algebra B(H) of all bounded operators on a Hilbert space H is
a von Neumann algebra, non-commutative if the Hilbert space has dimension at least 2.
Von Neumann algebras were first studied by von Neumann (1929); he and Francis Murray developed the basic
theory, under the original name of rings of operators, in a series of papers written in the 1930s and 1940s (F.J.
Murray & J. von Neumann 1936, 1937, 1943; J. von Neumann 1938, 1940, 1943, 1949), reprinted in the collected
works of von Neumann (1961).
Introductory accounts of von Neumann algebras are given in the online notes of Jones (2003) and Wassermann
(1991) and the books by Dixmier (1981), Schwartz (1967), Blackadar (2005) and Sakai (1971). The three volume
work by Takesaki (1979) gives an encyclopedic account of the theory. The book by Connes (1994) discusses more
advanced topics.
Definitions
There are three common ways to define von Neumann algebras.
The first and most common way is to define them as weakly closed *-algebras of bounded operators (on a Hilbert
space) containing the identity. In this definition the weak (operator) topology can be replaced by many other
common topologies including the strong, ultrastrong or ultraweak operator topologies. The *-algebras of bounded
operators that are closed in the norm topology are C*-algebras, so in particular any von Neumann algebra is a
C*-algebra.
The second definition is that a von Neumann algebra is a subset of the bounded operators closed under * and equal to
its double commutant, or equivalently the commutant of some subset closed under *. The von Neumann double
commutant theorem (von Neumann 1929) says that the first two definitions are equivalent.
The first two definitions describe a von Neumann algebra concretely as a set of operators acting on some given
Hilbert space. Sakai (1971) showed that von Neumann algebras can also be defined abstractly as C*-algebras that
have a predual; in other words the von Neumann algebra, considered as a Banach space, is the dual of some other
Banach space called the predual. The predual of a von Neumann algebra is in fact unique up to isomorphism. Some
authors use "von Neumann algebra" for the algebras together with a Hilbert space action, and "W*-algebra" for the
abstract concept, so a von Neumann algebra is a W*-algebra together with a Hilbert space and a suitable faithful
unital action on the Hilbert space. The concrete and abstract definitions of a von Neumann algebra are similar to the
concrete and abstract definitions of a C*-algebra, which can be defined either as norm-closed *-algebras of operators
on a Hilbert space, or as Banach *-algebras such that $\|a a^*\| = \|a\| \|a^*\|$.
Terminology
Some of the terminology in von Neumann algebra theory can be confusing, and the terms often have different
meanings outside the subject.
• A factor is a von Neumann algebra with trivial center, i.e. a center consisting only of scalar operators.
• A finite von Neumann algebra is one which is the direct integral of finite factors. Similarly, properly infinite von
Neumann algebras are the direct integral of properly infinite factors.
• A von Neumann algebra that acts on a separable Hilbert space is called separable. Note that such algebras are
rarely separable in the norm topology.
• The von Neumann algebra generated by a set of bounded operators on a Hilbert space is the smallest von
Neumann algebra containing all those operators.
• The tensor product of two von Neumann algebras acting on two Hilbert spaces is defined to be the von
Neumann algebra generated by their algebraic tensor product, considered as operators on the Hilbert space tensor
product of the Hilbert spaces.
By forgetting about the topology on a von Neumann algebra, we can consider it a (unital) *-algebra, or just a ring.
Von Neumann algebras are semihereditary: every finitely generated submodule of a projective module is itself
projective. There have been several attempts to axiomatize the underlying rings of von Neumann algebras, including
Baer *-rings and AW*-algebras. The *-algebra of affiliated operators of a finite von Neumann algebra is a von
Neumann regular ring. (The von Neumann algebra itself is in general not von Neumann regular.)
Commutative von Neumann algebras
Main article: Abelian von Neumann algebra
The relationship between commutative von Neumann algebras and measure spaces is analogous to that between
commutative C*-algebras and locally compact Hausdorff spaces. Every commutative von Neumann algebra is
isomorphic to $L^\infty(X)$ for some measure space $(X, \mu)$ and conversely, for every $\sigma$-finite measure space X, the *-algebra
$L^\infty(X)$ is a von Neumann algebra.
Due to this analogy, the theory of von Neumann algebras has been called noncommutative measure theory, while the
theory of C*-algebras is sometimes called noncommutative topology (Connes 1994).
Projections
Operators E in a von Neumann algebra for which E = EE = E* are called projections; they are exactly the operators
which give an orthogonal projection of H onto some closed subspace. A subspace of the Hilbert space H is said to
belong to the von Neumann algebra M if it is the image of some projection in M. Informally these are the closed
subspaces that can be described using elements of M, or that M "knows" about. The closure of the image of any
operator in M, or the kernel of any operator in M belong to M, and the closure of the image of any subspace
belonging to M under an operator of M also belongs to M. There is a 1:1 correspondence between projections of M
and subspaces that belong to it.
The basic theory of projections was worked out by Murray & von Neumann (1936). Two subspaces belonging to M
are called (Murray-von Neumann) equivalent if there is a partial isometry mapping the first isomorphically onto
the other that is an element of the von Neumann algebra (informally, if M "knows" that the subspaces are
isomorphic). This induces a natural equivalence relation on projections by defining E to be equivalent to F if the
corresponding subspaces are equivalent, or in other words if there is a partial isometry of H that maps the image of E
isometrically to the image of F and is an element of the von Neumann algebra. Another way of stating this is that E
is equivalent to F if E = uu* and F = u*u for some partial isometry u in M.
The equivalence relation ~ thus defined is additive in the following sense: Suppose $E_1 \sim F_1$ and $E_2 \sim F_2$. If $E_1 \perp E_2$
and $F_1 \perp F_2$, then $E_1 + E_2 \sim F_1 + F_2$. This is not true in general if one requires unitary equivalence in the definition of
~, i.e. if we say E is equivalent to F if u*Eu = F for some unitary u.
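The relation E = uu*, F = u*u can be seen concretely in the algebra of all 2 x 2 matrices, which is a von Neumann algebra acting on a two-dimensional Hilbert space. The sketch below is plain Python with real integer matrices, so the adjoint is the transpose; the choice of u as a matrix unit and the helper names are assumptions made for illustration.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def adjoint(a):
    """Adjoint of a real matrix, i.e. its transpose."""
    return [list(col) for col in zip(*a)]

# Illustrative partial isometry (assumed): u sends the second basis
# vector to the first one and kills the first basis vector.
u = [[0, 1],
     [0, 0]]
u_star = adjoint(u)

E = matmul(u, u_star)   # projection onto the range of u
F = matmul(u_star, u)   # projection onto the initial space of u

print("E =", E)
print("F =", F)
print("u u* u == u:", matmul(matmul(u, u_star), u) == u)
```

Here E and F project onto the two coordinate axes, so they are mutually orthogonal and distinct, yet Murray-von Neumann equivalent because the single operator u links them.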
The subspaces belonging to M are partially ordered by inclusion, and this induces a partial order ≤ of projections.
There is also a natural partial order on the set of equivalence classes of projections, induced by the partial order ≤ of
projections. If M is a factor, ≤ is a total order on equivalence classes of projections, described in the section on traces
below.
A projection (or subspace belonging to M) E is said to be finite if there is no projection F < E that is equivalent to E.
For example, all finite-dimensional projections (or subspaces) are finite (since isometries between Hilbert spaces
leave the dimension fixed), but the identity operator on an infinite-dimensional Hilbert space is not finite in the von
Neumann algebra of all bounded operators on it, since it is isometrically isomorphic to a proper subset of itself.
However it is possible for infinite dimensional subspaces to be finite.
Orthogonal projections are noncommutative analogues of indicator functions in L^∞(R). L^∞(R) is the ||·||_∞-closure of
the subspace generated by the indicator functions. Similarly, a von Neumann algebra is generated by its projections;
this is a consequence of the spectral theorem for self-adjoint operators.
Factors
A von Neumann algebra N whose center consists only of multiples of the identity operator is called a factor. Von
Neumann (1949) showed that every von Neumann algebra on a separable Hilbert space is isomorphic to a direct
integral of factors. This decomposition is essentially unique. Thus, the problem of classifying isomorphism classes of
von Neumann algebras on separable Hilbert spaces can be reduced to that of classifying isomorphism classes of
factors.
Murray & von Neumann (1936) showed that every factor has one of 3 types as described below. The type
classification can be extended to von Neumann algebras that are not factors, and a von Neumann algebra is of type X
if it can be decomposed as a direct integral of type X factors; for example, every commutative von Neumann algebra
has type I_1. Every von Neumann algebra can be written uniquely as a sum of von Neumann algebras of types I, II,
and III.
There are several other ways to divide factors into classes that are sometimes used:
• A factor is called discrete (or occasionally tame) if it has type I, and continuous (or occasionally wild) if it has
type II or III.
• A factor is called semifinite if it has type I or II, and purely infinite if it has type III.
• A factor is called finite if the projection 1 is finite and properly infinite otherwise. Factors of types I and II may
be either finite or properly infinite, but factors of type III are always properly infinite.
Type I factors
A factor is said to be of type I if there is a minimal projection E ≠ 0, i.e. a projection E such that there is no other
projection F with 0 < F < E. Any factor of type I is isomorphic to the von Neumann algebra of all bounded operators
on some Hilbert space; since there is one Hilbert space for every cardinal number, isomorphism classes of factors of
type I correspond exactly to the cardinal numbers. Since many authors consider von Neumann algebras only on
separable Hilbert spaces, it is customary to call the bounded operators on a Hilbert space of finite dimension n a
factor of type I_n, and the bounded operators on a separable infinite-dimensional Hilbert space, a factor of type I_∞.
Type II factors
A factor is said to be of type II if there are no minimal projections but there are non-zero finite projections. This
implies that every projection E can be halved in the sense that there are equivalent projections F and G such that E =
F + G. If the identity operator in a type II factor is finite, the factor is said to be of type II_1; otherwise, it is said to be
of type II_∞. The best understood factors of type II are the hyperfinite type II_1 factor and the hyperfinite type II_∞
factor, found by Murray & von Neumann (1936). These are the unique hyperfinite factors of types II_1 and II_∞; there
are an uncountable number of other factors of these types that are the subject of intensive study. Murray & von
Neumann (1937) proved the fundamental result that a factor of type II_1 has a unique finite tracial state, and the set of
traces of projections is [0,1].
A factor of type II_∞ has a semifinite trace, unique up to rescaling, and the set of traces of projections is [0,∞]. The set
of real numbers λ such that there is an automorphism rescaling the trace by a factor of λ is called the fundamental
group of the type II_∞ factor.
The tensor product of a factor of type II_1 and an infinite type I factor has type II_∞, and conversely any factor of type
II_∞ can be constructed like this. The fundamental group of a type II_1 factor is defined to be the fundamental group
of its tensor product with the infinite (separable) factor of type I. For many years it was an open problem to find a
type II factor whose fundamental group was not the group of all positive reals, but Connes then showed that the von
Neumann group algebra of a countable discrete group with Kazhdan's property T (the trivial representation is
isolated in the dual space), such as SL_3(Z), has a countable fundamental group. Subsequently Sorin Popa showed
that the fundamental group can be trivial for certain groups, including the semidirect product of Z^2 by SL_2(Z).
An example of a type II_1 factor is the von Neumann group algebra of a countable infinite discrete group such that
every non-trivial conjugacy class is infinite. McDuff (1969) found an uncountable family of such groups with
non-isomorphic von Neumann group algebras, thus showing the existence of uncountably many different separable
type II_1 factors.
Type III factors
Lastly, type III factors are factors that do not contain any nonzero finite projections at all. In their first paper Murray
& von Neumann (1936) were unable to decide whether or not they existed; the first examples were later found by von
Neumann (1940). Since the identity operator is always infinite in those factors, they were sometimes called type III_∞
in the past, but recently that notation has been superseded by the notation III_λ, where λ is a real number in the
interval [0,1]. More precisely, if the Connes spectrum (of its modular group) is 1 then the factor is of type III_0, if the
Connes spectrum is all integral powers of λ for 0 < λ < 1, then the type is III_λ, and if the Connes spectrum is all
positive reals then the type is III_1. (The Connes spectrum is a closed subgroup of the positive reals, so these are the
only possibilities.) The only trace on type III factors takes value ∞ on all non-zero positive elements, and any two
non-zero projections are equivalent. At one time type III factors were considered to be intractable objects, but
Tomita-Takesaki theory has led to a good structure theory. In particular, any type III factor can be written in a
canonical way as the crossed product of a type II_∞ factor and the real numbers.
The predual
Any von Neumann algebra M has a predual M_*, which is the Banach space of all ultraweakly continuous linear
functionals on M. As the name suggests, M is (as a Banach space) the dual of its predual. The predual is unique in the
sense that any other Banach space whose dual is M is canonically isomorphic to M_*. Sakai (1971) showed that the
existence of a predual characterizes von Neumann algebras among C* algebras.
The definition of the predual given above seems to depend on the choice of Hilbert space that M acts on, as this
determines the ultraweak topology. However the predual can also be defined without using the Hilbert space that M
acts on, by defining it to be the space generated by all positive normal linear functionals on M. (Here "normal"
means that it preserves suprema when applied to increasing nets of self adjoint operators; or equivalently to
increasing sequences of projections.)
The predual is a closed subspace of the dual M* (which consists of all norm-continuous linear functionals on M)
but is generally smaller. The proof that M_* is (usually) not the same as M* is nonconstructive and uses the axiom of
choice in an essential way; it is very hard to exhibit explicit elements of M* that are not in M_*. For example, exotic
positive linear forms on the von Neumann algebra l^∞(Z) are given by free ultrafilters; they correspond to exotic
*-homomorphisms into C and describe the Stone-Čech compactification of Z.
Examples:
1. The predual of the von Neumann algebra L^∞(R) of essentially bounded functions on R is the Banach space L^1(R)
of integrable functions. The dual of L^∞(R) is strictly larger than L^1(R). For example, a functional on L^∞(R) that
extends the Dirac measure δ_0 on the closed subspace of bounded continuous functions C_b^0(R) cannot be
represented as a function in L^1(R).
2. The predual of the von Neumann algebra B(H) of bounded operators on a Hilbert space H is the Banach space of
all trace class operators with the trace norm ||A|| = Tr(|A|). The Banach space of trace class operators is itself the
dual of the C*-algebra of compact operators (which is not a von Neumann algebra).
Weights, states, and traces
Weights and their special cases states and traces are discussed in detail in (Takesaki 1979).
• A weight ω on a von Neumann algebra is a linear map from the set of positive elements (those of the form a*a) to
[0,∞].
• A positive linear functional is a weight with ω(1) finite (or rather the extension of ω to the whole algebra by
linearity).
• A state is a weight with ω(1) = 1.
• A trace is a weight with ω(aa*) = ω(a*a) for all a.
• A tracial state is a trace with ω(1) = 1.
Any factor has a trace such that the trace of a non-zero projection is non-zero and the trace of a projection is infinite
if and only if the projection is infinite. Such a trace is unique up to rescaling. For factors that are separable or finite,
two projections are equivalent if and only if they have the same trace. The type of a factor can be read off from the
possible values of this trace as follows:
• Type I_n: 0, x, 2x, ..., nx for some positive x (usually normalized to be 1/n or 1).
• Type I_∞: 0, x, 2x, ..., ∞ for some positive x (usually normalized to be 1).
• Type II_1: [0,x] for some positive x (usually normalized to be 1).
• Type II_∞: [0,∞].
• Type III: 0, ∞.
If a von Neumann algebra acts on a Hilbert space containing a norm 1 vector v, then the functional a → (av,v) is a
normal state. This construction can be reversed to give an action on a Hilbert space from a normal state: this is the
GNS construction for normal states.
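The vector state a → (av,v) can be checked numerically in a finite-dimensional setting. The following is only a sketch for 2-by-2 matrices; the vector v, the matrix a and the helper names are choices made for this example, not part of the source.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def adjoint(a):
    """Conjugate transpose of a square matrix."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def vector_state(a, v):
    """Value (a v, v) = sum_i (a v)_i * conj(v_i) of the vector state at a."""
    n = len(v)
    av = [sum(a[i][j] * v[j] for j in range(n)) for i in range(n)]
    return sum(av[i] * complex(v[i]).conjugate() for i in range(n))

v = [0.6, 0.8j]                       # norm 1 vector: 0.36 + 0.64 = 1
identity = [[1, 0], [0, 1]]
a = [[1, 2j], [0, 3]]                 # arbitrary example matrix

value_at_identity = vector_state(identity, v)            # should be 1
value_at_a_star_a = vector_state(matmul(adjoint(a), a), v)  # equals |a v|^2
```

The value at the identity is 1 (up to rounding), and the value at a*a is the squared norm of av, hence non-negative, which is what makes the functional a state.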
Modules over a factor
Given an abstract separable factor, one can ask for a classification of its modules, meaning the separable Hilbert
spaces that it acts on. The answer is given as follows: every such module H can be given an M-dimension dim_M(H)
(not its dimension as a complex vector space) such that modules are isomorphic if and only if they have the same
M-dimension. The M-dimension is additive, and a module is isomorphic to a subspace of another module if and only
if it has smaller or equal M-dimension.
A module is called standard if it has a cyclic separating vector. Each factor has a standard representation, which is
unique up to isomorphism. The standard representation has an antilinear involution J such that JMJ = M'. For finite
factors the standard module is given by the GNS construction applied to the unique normal tracial state and the
M-dimension is normalized so that the standard module has M-dimension 1, while for infinite factors the standard
module is the module with M-dimension equal to ∞.
The possible M-dimensions of modules are given as follows:
• Type I_n (n finite): The M-dimension can be any of 0/n, 1/n, 2/n, 3/n, ..., ∞. The standard module has M-dimension
1 (and complex dimension n^2).
• Type I_∞: The M-dimension can be any of 0, 1, 2, 3, ..., ∞. The standard representation of B(H) is H⊗H; its
M-dimension is ∞.
• Type II_1: The M-dimension can be anything in [0, ∞]. It is normalized so that the standard module has
M-dimension 1. The M-dimension is also called the coupling constant of the module H.
• Type II_∞: The M-dimension can be anything in [0, ∞]. There is in general no canonical way to normalize it; the
factor may have outer automorphisms multiplying the M-dimension by constants. The standard representation is
the one with M-dimension ∞.
• Type III: The M-dimension can be 0 or ∞. Any two non-zero modules are isomorphic, and all non-zero modules
are standard.
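The type I_n entry of the list can be illustrated with a short computation. The sketch assumes the standard description of a finite-dimensional module over the n-by-n matrices as k copies of C^n (the multiplicity k and the function names are introduced here for illustration); infinite multiplicity is not modelled.

```python
from fractions import Fraction

def m_dimension(n, k):
    """M-dimension k/n of the module made of k copies of C^n over the n x n matrices."""
    return Fraction(k, n)

def complex_dimension(n, k):
    """Dimension of the same module as a complex vector space."""
    return n * k

# The standard module corresponds to multiplicity k = n.
n = 3
standard = (m_dimension(n, n), complex_dimension(n, n))
```

For n = 3 the standard module has M-dimension 1 and complex dimension 9 = n^2, in agreement with the list above, and a module embeds in another exactly when its multiplicity is smaller or equal.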
Amenable von Neumann algebras
Connes (1976) and others proved that the following conditions on a von Neumann algebra M on a separable Hilbert
space H are all equivalent:
• M is hyperfinite or AFD or approximately finite dimensional or approximately finite: this means the algebra
contains an ascending sequence of finite dimensional subalgebras with dense union. (Warning: some authors use
"hyperfinite" to mean "AFD and finite".)
• M is amenable: this means that the derivations of M with values in a normal dual Banach bimodule are all inner.
• M has Schwartz's property P: for any bounded operator T on H the weak operator closed convex hull of the
elements uTu* contains an element commuting with M.
• M is semidiscrete: this means the identity map from M to M is a weak pointwise limit of completely positive
maps of finite rank.
• M has property E or the Hakeda-Tomiyama extension property: this means that there is a projection of norm 1
from bounded operators on H to M'.
• M is injective: any completely positive linear map from any self adjoint closed subspace containing 1 of any
unital C*-algebra A to M can be extended to a completely positive map from A to M.
There is no generally accepted term for the class of algebras above; Connes has suggested that amenable should be
the standard term.
The amenable factors have been classified: there is a unique one of each of the types I_n, I_∞, II_1, II_∞, III_λ, for 0 < λ ≤ 1,
and the ones of type III_0 correspond to certain ergodic flows. (For type III_0 calling this a classification is a little
misleading, as it is known that there is no easy way to classify the corresponding ergodic flows.) The ones of type I
and II_1 were classified by Murray & von Neumann (1943), and the remaining ones were classified by Connes (1976),
except for the type III_1 case which was completed by Haagerup.
All amenable factors can be constructed using the group-measure space construction of Murray and von Neumann
for a single ergodic transformation. In fact they are precisely the factors arising as crossed products by free ergodic
actions of Z or Z/nZ on abelian von Neumann algebras L^∞(X). Type I factors occur when the measure space X is atomic
and the action transitive. When X is diffuse or non-atomic, it is equivalent to [0,1] as a measure space. Type II
factors occur when X admits an equivalent finite (II_1) or infinite (II_∞) measure, invariant under an action of Z. Type III factors
occur in the remaining cases where there is no invariant measure, but only an invariant measure class: these factors
are called Krieger factors.
Tensor products of von Neumann algebras
The Hilbert space tensor product of two Hilbert spaces is the completion of their algebraic tensor product. One can
define a tensor product of von Neumann algebras (a completion of the algebraic tensor product of the algebras
considered as rings), which is again a von Neumann algebra, and acts on the tensor product of the corresponding
Hilbert spaces. The tensor product of two finite algebras is finite, and the tensor product of an infinite algebra and a
non-zero algebra is infinite. The type of the tensor product of two von Neumann algebras (I, II, or III) is the
maximum of their types. The commutation theorem for tensor products states that
(M ⊗ N)' = M' ⊗ N'
(where M' denotes the commutant of M).
The tensor product of an infinite number of von Neumann algebras, if done naively, is usually a ridiculously large
non-separable algebra. Instead von Neumann (1938) showed that one should choose a state on each of the von
Neumann algebras, use this to define a state on the algebraic tensor product, which can be used to produce a Hilbert
space and a (reasonably small) von Neumann algebra. Araki & Woods (1968) studied the case where all the factors
are finite matrix algebras; these factors are called Araki-Woods factors or ITPFI factors (ITPFI stands for "infinite
tensor product of finite type I factors"). The type of the infinite tensor product can vary dramatically as the states are
changed; for example, the infinite tensor product of an infinite number of type I_2 factors can have any type
depending on the choice of states. In particular Powers (1967) found an uncountable family of non-isomorphic
hyperfinite type III_λ factors for 0 < λ < 1, called Powers factors, by taking an infinite tensor product of type I_2 factors,
each with the state given by
x ↦ Tr(D_λ x), where D_λ is the 2×2 diagonal matrix with diagonal entries 1/(λ+1) and λ/(λ+1).
All hyperfinite von Neumann algebras not of type III_0 are isomorphic to Araki-Woods factors, but there are
uncountably many of type III_0 that are not.
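The state used on each 2-by-2 matrix factor in the Powers construction sends x to the trace of D x, where D is diagonal with entries 1/(λ+1) and λ/(λ+1). The following sketch evaluates it on a single factor; the function name `powers_state` and the sample value of λ are assumptions for illustration, and nothing here builds the infinite tensor product itself.

```python
def powers_state(x, lam):
    """Tr(D x) with D = diag(1/(1+lam), lam/(1+lam)) for a 2x2 matrix x."""
    if not 0 < lam < 1:
        raise ValueError("lam must satisfy 0 < lam < 1")
    d0 = 1 / (1 + lam)
    d1 = lam / (1 + lam)
    # Only the diagonal of x contributes because D is diagonal.
    return d0 * x[0][0] + d1 * x[1][1]

lam = 0.5
identity = [[1, 0], [0, 1]]
e11 = [[1, 0], [0, 0]]    # minimal projection onto the first coordinate
e22 = [[0, 0], [0, 1]]    # minimal projection onto the second coordinate

total = powers_state(identity, lam)
ratio = powers_state(e22, lam) / powers_state(e11, lam)
```

The functional takes the value 1 on the identity, so it is a state, and the ratio of its values on the two minimal diagonal projections is λ, which is the parameter that labels the resulting factor.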
Bimodules and subfactors
A bimodule (or correspondence) is a Hilbert space H with module actions of two commuting von Neumann
algebras. Bimodules have a much richer structure than that of modules. Any bimodule over two factors always gives
a subfactor since one of the factors is always contained in the commutant of the other. There is also a subtle relative
tensor product operation due to Connes on bimodules. The theory of subfactors, initiated by Vaughan Jones,
reconciles these two seemingly different points of view.
Bimodules are also important for the von Neumann group algebra M of a discrete group Γ. Indeed if V is any
unitary representation of Γ, then, regarding Γ as the diagonal subgroup of Γ × Γ, the corresponding induced
representation on l^2(Γ, V) is naturally a bimodule for two commuting copies of M. Important representation
theoretic properties of Γ can be formulated entirely in terms of bimodules and therefore make sense for the von
Neumann algebra itself. For example Connes and Jones gave a definition of an analogue of Kazhdan's Property T for
von Neumann algebras in this way.
Non-amenable factors
Von Neumann algebras of type I are always amenable, but for the other types there are an uncountable number of
different non-amenable factors, which seem very hard to classify, or even distinguish from each other. Nevertheless
Voiculescu has shown that the class of non-amenable factors coming from the group-measure space construction is
disjoint from the class coming from group von Neumann algebras of free groups. Later Narutaka Ozawa proved that
group von Neumann algebras of hyperbolic groups yield prime type II_1 factors, i.e. ones that cannot be factored as
tensor products of type II_1 factors, a result first proved by Liming Ge for free group factors using Voiculescu's free
entropy. Popa's work on fundamental groups of non-amenable factors represents another significant advance. The
theory of factors "beyond the hyperfinite" is rapidly expanding at present, with many new and surprising results; it
has close links with rigidity phenomena in geometric group theory and ergodic theory.
Examples
• The essentially bounded functions on a σ-finite measure space form a commutative (type I_1) von Neumann
algebra acting on the L^2 functions. For certain non-σ-finite measure spaces, usually considered pathological,
L^∞(X) is not a von Neumann algebra; for example, the σ-algebra of measurable sets might be the
countable-cocountable algebra on an uncountable set.
• The bounded operators on any Hilbert space form a von Neumann algebra, indeed a factor, of type I.
• If we have any unitary representation of a group G on a Hilbert space H then the bounded operators commuting
with G form a von Neumann algebra G', whose projections correspond exactly to the closed subspaces of H
invariant under G. Equivalent subrepresentations correspond to equivalent projections in G'. The double
commutant G" of G is also a von Neumann algebra.
• The von Neumann group algebra of a discrete group G is the algebra of all bounded operators on H = l^2(G)
commuting with the action of G on H through right multiplication. One can show that this is the von Neumann
algebra generated by the operators corresponding to multiplication from the left with an element g ∈ G. It is a
factor (of type II_1) if every non-trivial conjugacy class of G is infinite (for example, a non-abelian free group),
and is the hyperfinite factor of type II_1 if in addition G is a union of finite subgroups (for example, the group of all
permutations of the integers fixing all but a finite number of elements).
• The tensor product of two von Neumann algebras, or of a countable number with states, is a von Neumann
algebra as described in the section above.
• The crossed product of a von Neumann algebra by a discrete (or more generally locally compact) group can be
defined, and is a von Neumann algebra. Special cases are the group-measure space construction of Murray and
von Neumann and Krieger factors.
• The von Neumann algebras of a measurable equivalence relation and a measurable groupoid can be defined.
These examples generalise von Neumann group algebras and the group-measure space construction.
Applications
Von Neumann algebras have found applications in diverse areas of mathematics such as knot theory, statistical
mechanics, quantum field theory, local quantum physics, free probability, noncommutative geometry,
representation theory, geometry, and probability.
References
• Araki, H.; Woods, E. J. (1968), "A classification of factors", Publ. Res. Inst. Math. Sci. Ser. A 4: 51-130,
doi:10.2977/prims/1195195263, MR0244773
• Blackadar, B. (2005), Operator algebras, Springer, ISBN 3-540-28486-9
• Connes, A. (1976), "Classification of Injective Factors" [1], The Annals of Mathematics 2nd Ser. 104 (1): 73-115,
doi:10.2307/1971057
• Connes, A. (1994), Non-commutative geometry [2], Academic Press, ISBN 0-12-185860-X.
• Dixmier, J. (1981), Von Neumann algebras, ISBN 0-444-86308-7 (A translation of Dixmier, J. (1957), Les
algèbres d'opérateurs dans l'espace hilbertien: algèbres de von Neumann, Gauthier-Villars, the first book about
von Neumann algebras.)
• Jones, V.F.R. (2003), von Neumann algebras [3]; incomplete notes from a course.
• McDuff, Dusa (1969), "Uncountably many II_1 factors" [4], Ann. of Math. (Annals of Mathematics) 90 (2):
372-377, doi:10.2307/1970730
• Murray, F. J., "The rings of operators papers", The legacy of John von Neumann (Hempstead, NY, 1988), Proc.
Sympos. Pure Math., 50, Providence, RI: Amer. Math. Soc., pp. 57-60, ISBN 0-8218-4219-6. A historical
account of the discovery of von Neumann algebras.
• Murray, F.J.; von Neumann, J. (1936), "On rings of operators" [5], Ann. of Math. (2) (Annals of Mathematics) 37
(1): 116-229, doi:10.2307/1968693. This paper gives their basic properties and the division into types I, II, and
III, and in particular finds factors not of type I.
• Murray, F.J.; von Neumann, J. (1937), "On rings of operators II" [6], Trans. Amer. Math. Soc. (American
Mathematical Society) 41 (2): 208-248, doi:10.2307/1989620. This is a continuation of the previous paper, that
studies properties of the trace of a factor.
• Murray, F.J.; von Neumann, J. (1943), "On rings of operators IV" [7], Ann. of Math. (2) (Annals of Mathematics)
44 (4): 716-808, doi:10.2307/1969107. This studies when factors are isomorphic, and in particular shows that all
approximately finite factors of type II_1 are isomorphic.
• Powers, Robert T. (1967), "Representations of Uniformly Hyperfinite Algebras and Their Associated von
Neumann Rings" [8], The Annals of Mathematics, 2nd Ser. 86 (1): 138-171, doi:10.2307/1970364
• Sakai, S. (1971), C*-algebras and W*-algebras, Springer, ISBN 3-540-63633-1
• Schwartz, Jacob (1967), W-* Algebras, ISBN 0-677-00670-5
• Shtern, A.I. (2001), "von Neumann algebra" [9], in Hazewinkel, Michiel, Encyclopaedia of Mathematics,
Springer, ISBN 978-1556080104
• Takesaki, M. (1979), Theory of Operator Algebras I, II, III, ISBN 3-540-42248-X, ISBN 3-540-42914-X, ISBN
3-540-42913-1
• von Neumann, J. (1929), "Zur Algebra der Funktionaloperationen und Theorie der normalen Operatoren", Math.
Ann. 102: 370-427, doi:10.1007/BF01782352. The original paper on von Neumann algebras.
• von Neumann, J. (1936), "On a Certain Topology for Rings of Operators" [10], The Annals of Mathematics 2nd
Ser. 37 (1): 111-115, doi:10.2307/1968692. This defines the ultrastrong topology.
• von Neumann, J. (1938), "On infinite direct products" [11], Compos. Math. 6: 1-77. This discusses infinite tensor
products of Hilbert spaces and the algebras acting on them.
• von Neumann, J. (1940), "On rings of operators III" [12], Ann. of Math. (2) (Annals of Mathematics) 41 (1):
94-161, doi:10.2307/1968823. This shows the existence of factors of type III.
• von Neumann, J. (1943), "On Some Algebraical Properties of Operator Rings" [13], The Annals of Mathematics
2nd Ser. 44 (4): 709-715, doi:10.2307/1969106. This shows that some apparently topological properties in von
Neumann algebras can be defined purely algebraically.
• von Neumann, J. (1949), "On Rings of Operators. Reduction Theory" [14], The Annals of Mathematics 2nd Ser. 50
(2): 401-485, doi:10.2307/1969463. This discusses how to write a von Neumann algebra as a sum or integral of
factors.
• von Neumann, John (1961), Taub, A.H., ed., Collected Works, Volume III: Rings of Operators, NY: Pergamon
Press. Reprints von Neumann's papers on von Neumann algebras.
• Wassermann, A. J. (1991), Operators on Hilbert space [15]
References
[1] http://links.jstor.org/sici?sici=0003-486X%28197607%292%3A104%3A1%3C73%3ACOIFC%3E2.0.CO%^
[2] http://www.alainconnes.org/docs/book94bigpdf.pdf
[3] http://www.math.berkeley.edu/~vfr/MATH20909/VonNeumann2009.pdf
[4] http://links.jstor.org/sici?sici=0003-486X%28196909%292%3A90%3A2%3C372%3AUMIF%3E2.0.CO%3B2-C
[5] http://links.jstor.org/sici?sici=0003-486X%28193601%292%3A37%3A1%3C116%3AOROO%3E2.0.CO%3B2-Y
[6] http://links.jstor.org/sici?sici=0002-9947%28193703%2941%3A2%3C208%3AOROOI%3E2.0.CO%3B2-9
[7] http://links.jstor.org/sici?sici=0003-486X%28194310%292%3A44%3A4%3C716%3AOROOI%3E2.0.CO%3B2-O
[8] http://links.jstor.org/sici?sici=0003-486X%28196707%292%3A86%3A1%3C138%3AROUHAA%3E2.0.CO%3B2-6
[9] http://eom.springer.de/V/v096900.htm
[10] http://links.jstor.org/sici?sici=0003-486X%28193601%292%3A37%3A1%3C111%3AOACTFR%3E2.0.TO
[11] http://www.numdam.org/item?id=CM_1939__6__1_0
[12] http://links.jstor.org/sici?sici=0003-486X%28194001%292%3A41%3A1%3C94%3AOROOI%3E2.0.CO%3B2-U
[13] http://links.jstor.org/sici?sici=0003-486X%28194310%292%3A44%3A4%3C709%3AOSAPOO%3E2.0.CO%3B2-L
[14] http://links.jstor.org/sici?sici=0003-486X%28194904%292%3A50%3A2%3C401%3AOROORT%3E2.0.CO%3B2-H
[15] http://iml.univ-mrs.fr/~wasserm/OHS.ps
C*-algebra
C*-algebras (pronounced "C-star") are an important area of research in functional analysis, a branch of
mathematics. The prototypical example of a C*-algebra is a complex algebra A of linear operators on a complex
Hilbert space with two additional properties:
• A is a topologically closed set in the norm topology of operators.
• A is closed under the operation of taking adjoints of operators.
It is generally believed that C*-algebras were first considered primarily for their use in quantum mechanics to model
algebras of physical observables. This line of research began with Werner Heisenberg's matrix mechanics and in a
more mathematically developed form with Pascual Jordan around 1933. Subsequently John von Neumann attempted
to establish a general framework for these algebras which culminated in a series of papers on rings of operators.
These papers considered a special class of C*-algebras which are now known as von Neumann algebras.
Around 1943, the work of Israel Gelfand and Mark Naimark yielded an abstract characterisation of C*-algebras
making no reference to operators.
C*-algebras are now an important tool in the theory of unitary representations of locally compact groups, and are
also used in algebraic formulations of quantum mechanics.
Abstract characterization
We begin with the abstract characterization of C*-algebras given in the 1943 paper by Gelfand and Naimark.
A C*-algebra, A, is a Banach algebra over the field of complex numbers, together with a map, * : A → A, called an
involution. The image of an element x of A under the involution is written x*. Involution has the following
properties:
• For all x, y in A:
(x + y)* = x* + y*
(xy)* = y*x*
• For every λ in C and every x in A:
(λx)* = λ̄x*, where λ̄ denotes the complex conjugate of λ.
• For all x in A:
(x*)* = x
• The C*-identity holds for all x in A:
||x*x|| = ||x|| ||x*||.
Note that the C*-identity is equivalent to: for all x in A:
||xx*|| = ||x|| ||x*||.
This relation is equivalent to ||xx*|| = ||x||^2, which is sometimes called the B*-identity. For history behind the
names C*- and B*-algebras, see the history section below.
The C*-identity is a very strong requirement. For instance, together with the spectral radius formula, it implies the
C*-norm is uniquely determined by the algebraic structure:
||x||^2 = ||x*x|| = sup{|λ| : x*x − λ1 is not invertible}.
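For 2-by-2 matrices the formula above can be evaluated directly: the norm of x is the square root of the largest eigenvalue of x*x. The sketch below is a minimal check under that assumption; the helper names and the sample matrix are illustrative only.

```python
import math

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def adjoint(a):
    """Conjugate transpose of a square matrix."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def largest_eigenvalue_hermitian_2x2(h):
    """Largest eigenvalue of a 2x2 Hermitian matrix from its trace and determinant."""
    t = (h[0][0] + h[1][1]).real
    d = (h[0][0] * h[1][1] - h[0][1] * h[1][0]).real
    return (t + math.sqrt(max(t * t - 4 * d, 0.0))) / 2

def c_star_norm(x):
    """Operator norm of a 2x2 matrix: square root of the spectral radius of x*x."""
    return math.sqrt(largest_eigenvalue_hermitian_2x2(matmul(adjoint(x), x)))

x = [[0, 2],
     [0, 0]]                          # nilpotent: the only eigenvalue of x is 0
norm_x = c_star_norm(x)
norm_x_star_x = c_star_norm(matmul(adjoint(x), x))
```

The sample x is nilpotent, so its own spectrum is {0}, yet its norm is 2; the norm is recovered from the spectrum of x*x rather than from that of x, and ||x*x|| equals ||x||^2 as the identity requires.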
A bounded linear map, π : A → B, between C*-algebras A and B is called a *-homomorphism if
• For x and y in A
π(xy) = π(x)π(y)
• For x in A
π(x*) = π(x)*
In the case of C*-algebras, any *-homomorphism π between C*-algebras is non-expansive, i.e. bounded with norm ≤
1. Furthermore, an injective *-homomorphism between C*-algebras is isometric. These are consequences of the
C*-identity.
A bijective *-homomorphism π is called a C*-isomorphism, in which case A and B are said to be isomorphic.
Some history: B*-algebras and C*-algebras
The term B*-algebra was introduced by C. E. Rickart in 1946 to describe Banach *-algebras that satisfy the
condition
• ||xx*|| = ||x||^2 for all x in the given B*-algebra. (B*-condition)
This condition automatically implies that the *-involution is isometric, that is, ||x|| = ||x*||. Hence ||xx*|| = ||x|| ||x*||,
and therefore, a B*-algebra is a C*-algebra. Conversely, the C*-condition implies the B*-condition. This is
nontrivial, and can be proved without using the condition ||x|| = ||x*||. (For details, see R. S. Doran, V. A. Belfi,
Characterizations of C*-Algebras: the Gelfand-Naimark Theorems, CRC, 1986.)
For these reasons, the term B*-algebra is rarely used in current terminology, and has been replaced by the term
'C*-algebra'.
The term C*-algebra was introduced by I. E. Segal in 1947 to describe norm-closed subalgebras of B(H), namely,
the space of bounded operators on some Hilbert space H. 'C' stood for 'closed'.
Examples
Finite-dimensional C*-algebras
The algebra M_n(C) of n-by-n matrices over C becomes a C*-algebra if we consider matrices as operators on the
Euclidean space, C^n, and use the operator norm ||.|| on matrices. The involution is given by the conjugate transpose.
More generally, one can consider finite direct sums of matrix algebras. In fact, all finite dimensional C*-algebras are
of this form. The self-adjoint requirement means finite-dimensional C*-algebras are semisimple, from which fact
one can deduce the following theorem of Artin-Wedderburn type:
Theorem. A finite-dimensional C*-algebra, A, is canonically isomorphic to a finite direct sum
A = ⊕_{e ∈ min A} Ae
where min A is the set of minimal nonzero self-adjoint central projections of A.
Each C*-algebra, Ae, is isomorphic (in a noncanonical way) to the full matrix algebra M_{dim(e)}(C). The finite family
indexed on min A given by {dim(e)}_e is called the dimension vector of A. This vector uniquely determines the
isomorphism class of a finite-dimensional C*-algebra.
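A short sketch of how the dimension vector can be used, assuming it is represented as a plain list of positive integers (one matrix size per summand) and compared up to reordering of the summands; both function names are hypothetical.

```python
def algebra_dimension(dimension_vector):
    """Complex dimension of the direct sum of d x d matrix algebras, d in the vector."""
    return sum(d * d for d in dimension_vector)

def are_isomorphic(dims_a, dims_b):
    """Compare two dimension vectors up to reordering of the summands."""
    return sorted(dims_a) == sorted(dims_b)

two_by_two_twice = [2, 2]            # two copies of the 2x2 matrices
mixed = [1, 1, 1, 1, 2]              # four copies of C and one copy of the 2x2 matrices
```

Both algebras have complex dimension 8, but their dimension vectors differ, so they are not isomorphic; the dimension of the algebra alone does not determine the isomorphism class.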
C*-algebras of operators
The prototypical example of a C*-algebra is the algebra B(H) of bounded (equivalently continuous) linear operators
defined on a complex Hilbert space H; here x* denotes the adjoint operator of the operator x : H → H. In fact, every
C*-algebra, A, is *-isomorphic to a norm-closed adjoint closed subalgebra of B(H) for a suitable Hilbert space, H;
this is the content of the Gelfand-Naimark theorem.
Commutative C*-algebras
Let X be a locally compact Hausdorff space. The space C_0(X) of complex-valued continuous functions on X that
vanish at infinity (defined in the article on local compactness) forms a commutative C*-algebra under
pointwise multiplication and addition. The involution is pointwise conjugation. C_0(X) has a multiplicative unit
element if and only if X is compact. As does any C*-algebra, C_0(X) has an approximate identity. In the case of C_0(X)
this is immediate: consider the directed set of compact subsets of X, and for each compact K let f_K be a function of
compact support which is identically 1 on K. Such functions exist by the Tietze extension theorem which applies to
locally compact Hausdorff spaces. {f_K}_K is an approximate identity.
The Gelfand representation states that every commutative C*-algebra is *-isomorphic to the algebra C_0(X), where X
is the space of characters equipped with the weak* topology. Furthermore if C_0(X) is isomorphic to C_0(Y) as
C*-algebras, it follows that X and Y are homeomorphic. This characterization is one of the motivations for the
noncommutative topology and noncommutative geometry programs.
C*-algebras of compact operators
Let H be a separable infinite-dimensional Hilbert space. The algebra K(H) of compact operators on H is a norm
closed subalgebra of B(H). It is also closed under involution; hence it is a C*-algebra.
Concrete C*-algebras of compact operators admit a characterization similar to Wedderburn's theorem for finite
dimensional C*-algebras.
Theorem. If A is a C*-subalgebra of K(H), then there exist Hilbert spaces $\{H_i\}_{i \in I}$ such that A is isomorphic to the
following direct sum
$$\bigoplus_{i \in I} K(H_i),$$
where the (C*-)direct sum consists of elements $(T_i)$ of the Cartesian product $\prod_i K(H_i)$ with $\|T_i\| \to 0$.
Though K(H) does not have an identity element, a sequential approximate identity for K(H) can be easily displayed.
To be specific, H is isomorphic to the space of square summable sequences $\ell^2$; we may assume that
$$H = \ell^2.$$
For each natural number n let $H_n$ be the subspace of sequences of $\ell^2$ which vanish for indices $k \geq n$,
and let $e_n$ be the orthogonal projection onto $H_n$. The sequence $\{e_n\}_n$ is an approximate identity for K(H).
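As a concrete illustration of this approximate identity, the following worked example is a sketch under assumptions that are not in the text: indices are taken to start at 1, $P_n$ denotes the orthogonal projection onto the first n coordinates of $\ell^2$ (playing the role of the projections above), and the diagonal operator T is chosen only for illustration.

```latex
% Assumed example: T is the diagonal operator with entries 1, 1/2, 1/3, ...
T(x_1, x_2, x_3, \dots) = \left(x_1, \tfrac{x_2}{2}, \tfrac{x_3}{3}, \dots\right),
\qquad
P_n T(x_1, x_2, \dots) = \left(x_1, \tfrac{x_2}{2}, \dots, \tfrac{x_n}{n}, 0, 0, \dots\right).

% T - P_n T is diagonal with entries 0 (n times), 1/(n+1), 1/(n+2), ...
\| T - P_n T \| = \sup_{k > n} \frac{1}{k} = \frac{1}{n+1} \longrightarrow 0
\quad (n \to \infty).

% By contrast, for the identity operator I (which is not compact):
\| I - P_n I \| = 1 \quad \text{for every } n.
```

The contrast suggests why the projections approximate every element of K(H) in norm without themselves converging in norm to an identity element.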
K(H) is a two-sided closed ideal of B(H). For separable Hilbert spaces, it is the unique nontrivial closed two-sided ideal. The quotient of B(H) by
K(H) is the Calkin algebra.
C*-enveloping algebra
Given a B*-algebra A with an approximate identity, there is a unique (up to C*-isomorphism) C*-algebra E(A) and
*-morphism $\pi$ from A into E(A) which is universal, that is, every other B*-morphism $\pi' : A \to B$ factors uniquely
through $\pi$. The algebra E(A) is called the C*-enveloping algebra of the B*-algebra A.
Of particular importance is the C*-algebra of a locally compact group G. This is defined as the enveloping
C*-algebra of the group algebra of G. The C*-algebra of G provides context for general harmonic analysis of G in
the case G is non-abelian. In particular, the dual of a locally compact group is defined to be the primitive ideal space
of the group C*-algebra. See spectrum of a C*-algebra.
von Neumann algebras
von Neumann algebras, known as W* algebras before the 1960s, are a special kind of C*-algebra. They are required
to be closed in the weak operator topology, which is weaker than the norm topology. Their study is a specialized area
of functional analysis.
Properties of C*-algebras
C*-algebras have a large number of properties that are technically convenient. These properties can be established by
using the continuous functional calculus or by reduction to commutative C*-algebras. In the latter case, we can use the
fact that the structure of these is completely determined by the Gelfand isomorphism.
• The set of elements of a C*-algebra A of the form x*x forms a closed convex cone. This cone is identical to the
elements of the form x x*. Elements of this cone are called non-negative (or sometimes positive, even though this
terminology conflicts with its use for elements of R.)
• The set of self-adjoint elements of a C*-algebra A naturally has the structure of a partially ordered vector space;
the ordering is usually denoted $\geq$. In this ordering, a self-adjoint element x of A satisfies $x \geq 0$ if and only if the
spectrum of x is non-negative. Two self-adjoint elements x and y of A satisfy $x \geq y$ if $x - y \geq 0$.
• Any C*-algebra A has an approximate identity. In fact, there is a directed family $\{e_\lambda\}_{\lambda \in I}$ of self-adjoint elements
of A such that
$$x e_\lambda \to x,$$
$$0 \leq e_\lambda \leq e_\mu \leq 1 \quad \text{whenever } \lambda \leq \mu.$$
In case A is separable, A has a sequential approximate identity. More generally, A will have a sequential
approximate identity if and only if A contains a strictly positive element, i.e. a positive element h such that
hAh is dense in A.
• Using approximate identities, one can show that the algebraic quotient of a C*-algebra by a closed proper
two-sided ideal, with the natural norm, is a C*-algebra.
• Similarly, a closed two-sided ideal of a C*-algebra is itself a C*-algebra.
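The positivity statements in the list above can be checked numerically in the simplest case, the C*-algebra of 2-by-2 complex matrices. This is only a sketch: the helper names are hypothetical, matrices are nested lists, and the eigenvalues of a Hermitian 2-by-2 matrix are obtained from its trace and determinant.

```python
def adjoint(x):
    """Conjugate transpose of a 2x2 matrix given as nested lists."""
    return [[x[j][i].conjugate() for j in range(2)] for i in range(2)]


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def eigenvalues_hermitian(h):
    """Real eigenvalues of a 2x2 Hermitian matrix via trace and determinant."""
    tr = (h[0][0] + h[1][1]).real
    det = (h[0][0] * h[1][1] - h[0][1] * h[1][0]).real
    disc = max(tr * tr - 4 * det, 0.0) ** 0.5
    return ((tr - disc) / 2, (tr + disc) / 2)


def is_nonnegative(h, tol=1e-12):
    """x >= 0 exactly when the spectrum of x is non-negative."""
    return min(eigenvalues_hermitian(h)) >= -tol


def geq(a, b):
    """The partial order on self-adjoint elements: a >= b iff a - b >= 0."""
    diff = [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]
    return is_nonnegative(diff)


x = [[1 + 2j, 0.5j], [-3 + 0j, 2 - 1j]]   # an arbitrary, non-self-adjoint element
p = matmul(adjoint(x), x)                 # x*x
q = matmul(x, adjoint(x))                 # x x*

print(is_nonnegative(p), is_nonnegative(q))
print(is_nonnegative([[1, 0], [0, -1]]))  # self-adjoint but not non-negative
```

Both x*x and xx* land in the cone of non-negative elements, while a self-adjoint matrix with a negative eigenvalue does not.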
Type for C*-algebras
A C*-algebra A is of type I if and only if for all non-degenerate representations $\pi$ of A the von Neumann algebra
$\pi(A)''$ (that is, the bicommutant of $\pi(A)$) is a type I von Neumann algebra. In fact it is sufficient to consider only
factor representations, i.e. representations $\pi$ for which $\pi(A)''$ is a factor.
A locally compact group is said to be of type I if and only if its group C*-algebra is type I.
However, if a C*-algebra has non-type I representations, then by results of James Glimm it also has representations
of type II and type III. Thus for C*-algebras and locally compact groups, it is only meaningful to speak of type I and
non type I properties.
C*-algebras and quantum field theory
In quantum field theory, one typically describes a physical system with a C*-algebra A with unit element; the
self-adjoint elements of A (elements x with x* = x) are thought of as the observables, the measurable quantities, of
the system. A state of the system is defined as a positive functional on A (a $\mathbb{C}$-linear map $\varphi : A \to \mathbb{C}$ with $\varphi(u^* u) \geq 0$
for all $u \in A$) such that $\varphi(1) = 1$. The expected value of the observable x, if the system is in state $\varphi$, is then $\varphi(x)$.
See Local quantum physics.
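A minimal sketch of a state, assuming the algebra is the 2-by-2 complex matrices and the state is a vector state $\varphi(x) = \langle v, x v\rangle$ for a unit vector v. The function names and the choice of v are hypothetical; the Pauli matrices are used only as familiar self-adjoint observables.

```python
def adjoint(x):
    """Conjugate transpose of a 2x2 matrix given as nested lists."""
    return [[x[j][i].conjugate() for j in range(2)] for i in range(2)]


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def vector_state(v):
    """Return phi(x) = <v, x v> for the normalised version of v in C^2."""
    norm = sum(abs(c) ** 2 for c in v) ** 0.5
    v = [c / norm for c in v]

    def phi(x):
        xv = [sum(x[i][k] * v[k] for k in range(2)) for i in range(2)]
        return sum(v[i].conjugate() * xv[i] for i in range(2))

    return phi


identity = [[1 + 0j, 0j], [0j, 1 + 0j]]
sigma_y = [[0j, -1j], [1j, 0j]]            # self-adjoint observable
sigma_z = [[1 + 0j, 0j], [0j, -1 + 0j]]    # self-adjoint observable

phi = vector_state([1 + 0j, 1j])           # assumed example vector
u = [[1 + 2j, 0.5j], [-3 + 0j, 2 - 1j]]    # arbitrary element of the algebra

print(phi(identity))                # normalisation phi(1) = 1, up to rounding
print(phi(matmul(adjoint(u), u)))   # positivity: real part is non-negative
print(phi(sigma_y), phi(sigma_z))   # expected values of two observables
```

Here v happens to be an eigenvector of sigma_y, so that observable has a sharp expected value, while sigma_z averages out.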
References
• W. Arveson, An Invitation to C*-Algebra, Springer-Verlag, 1976. ISBN 0-387-90176-0. An excellent
introduction to the subject, accessible for those with a knowledge of basic functional analysis.
• A. Connes, Non-commutative geometry (http://www.alainconnes.org/docs/book94bigpdf.pdf), ISBN
0-12-185860-X. This book is widely regarded as a source of new research material, providing much supporting
intuition, but it is difficult.
• J. Dixmier, Les C*-algebres et leurs representations, Gauthier-Villars, 1969. ISBN 0-7204-0762-1. This is a
somewhat dated reference, but is still considered as a high-quality technical exposition. It is available in English
from North Holland press.
• G. Emch, Algebraic Methods in Statistical Mechanics and Quantum Field Theory, Wiley-Interscience, 1972.
ISBN 0-471-23900-3. Mathematically rigorous reference which provides extensive physics background.
• A.I. Shtern (2001), "C* algebra" (http://eom.springer.de/c/c020020.htm), in Hazewinkel, Michiel,
Encyclopaedia of Mathematics, Springer, ISBN 978-1556080104
• S. Sakai, C*-algebras and W*-algebras , Springer (1971) ISBN 3-540-63633-1
Kac-Moody algebra
In mathematics, a Kac-Moody algebra (named for Victor Kac and Robert Moody, who independently discovered
them) is a Lie algebra, usually infinite-dimensional, that can be defined by generators and relations through a
generalized Cartan matrix. These algebras form a generalization of finite-dimensional semisimple Lie algebras, and
many properties related to the structure of a Lie algebra such as its root system, irreducible representations, and
connection to flag manifolds have natural analogues in the Kac-Moody setting. A class of Kac-Moody algebras
called affine Lie algebras is of particular importance in mathematics and theoretical physics, especially conformal
field theory and the theory of exactly solvable models. Kac discovered an elegant proof of certain combinatorial
identities, Macdonald identities, which is based on the representation theory of affine Kac-Moody algebras. Garland
and Lepowsky demonstrated that Rogers-Ramanujan identities can be derived in a similar fashion.
Definition
A Kac-Moody algebra is given by the following:
1. An n×n generalized Cartan matrix $C = (c_{ij})$ of rank r.
2. A vector space $\mathfrak{h}$ over the complex numbers of dimension 2n - r.
3. A set of n linearly independent elements $\alpha_i$ of $\mathfrak{h}$ and a set of n linearly independent elements $\alpha_i^*$ of the dual
space, such that $\alpha_i^*(\alpha_j) = c_{ij}$. The $\alpha_i$ are known as coroots, while the $\alpha_i^*$ are known as roots.
The Kac-Moody algebra is the Lie algebra $\mathfrak{g}$ defined by generators $e_i$ and $f_i$ and the elements of $\mathfrak{h}$ and relations
• $[e_i, f_i] = \alpha_i$
• $[e_i, f_j] = 0$ for $i \neq j$
• $[e_i, x] = \alpha_i^*(x)\, e_i$, for $x \in \mathfrak{h}$
• $[f_i, x] = -\alpha_i^*(x)\, f_i$, for $x \in \mathfrak{h}$
• $[x, x'] = 0$ for $x, x' \in \mathfrak{h}$
• $\mathrm{ad}(e_i)^{1 - c_{ij}}(e_j) = 0$
• $\mathrm{ad}(f_i)^{1 - c_{ij}}(f_j) = 0$
where $\mathrm{ad} : \mathfrak{g} \to \mathrm{End}(\mathfrak{g})$, $\mathrm{ad}(x)(y) = [x, y]$, is the adjoint representation of $\mathfrak{g}$.
A real (possibly infinite-dimensional) Lie algebra is also considered a Kac-Moody algebra if its complexification is
a Kac-Moody algebra.
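The data in items 1 and 2 of the definition can be computed for small examples. The sketch below assumes the standard conditions on a generalized Cartan matrix (integer entries, diagonal entries equal to 2, non-positive off-diagonal entries, and $c_{ij} = 0$ exactly when $c_{ji} = 0$), which are not spelled out in the text above; the function names are hypothetical.

```python
from fractions import Fraction


def is_generalized_cartan_matrix(c):
    """Check the (assumed, standard) conditions on an integer square matrix."""
    n = len(c)
    if any(len(row) != n for row in c):
        return False
    for i in range(n):
        if c[i][i] != 2:
            return False
        for j in range(n):
            if i != j:
                if c[i][j] > 0:
                    return False
                if (c[i][j] == 0) != (c[j][i] == 0):
                    return False
    return True


def rank(c):
    """Rank over the rationals by exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in c]
    rows, cols = len(m), len(m[0])
    r = 0
    for col in range(cols):
        pivot = next((i for i in range(r, rows) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, rows):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == rows:
            break
    return r


def cartan_subalgebra_dimension(c):
    """Dimension 2n - r of the vector space in item 2 of the definition."""
    return 2 * len(c) - rank(c)


a2 = [[2, -1], [-1, 2]]          # finite type A_2
affine_a1 = [[2, -2], [-2, 2]]   # affine type A_1^(1)

print(cartan_subalgebra_dimension(a2))
print(cartan_subalgebra_dimension(affine_a1))
```

For the invertible matrix the dimension equals n, while the singular affine matrix needs one extra dimension so that the n roots can remain linearly independent.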
Interpretation
$\mathfrak{h}$ is a Cartan subalgebra of the Kac-Moody algebra.
If g is an element of the Kac-Moody algebra such that
$$\forall x \in \mathfrak{h}, \quad [g, x] = \omega(x)\, g,$$
where $\omega$ is an element of $\mathfrak{h}^*$, then g is said to have weight $\omega$. The Kac-Moody algebra can be diagonalized into
weight eigenvectors. The Cartan subalgebra $\mathfrak{h}$ has weight zero, $e_i$ has weight $\alpha_i^*$ and $f_i$ has weight $-\alpha_i^*$. If the Lie
bracket of two weight eigenvectors is nonzero, then its weight is the sum of their weights. The condition
$[e_i, f_j] = 0$ for $i \neq j$ simply means the $\alpha_i^*$ are simple roots.
Types of Kac-Moody algebras
Properties of a Kac-Moody algebra are controlled by the algebraic properties of its generalized Cartan matrix C. In
order to classify Kac-Moody algebras, it is enough to consider the case of an indecomposable matrix C, that is,
assume that there is no decomposition of the set of indices I into a disjoint union of non-empty subsets $I_1$ and $I_2$ such
that $C_{ij} = 0$ for all i in $I_1$ and j in $I_2$. Any decomposition of the generalized Cartan matrix leads to the direct sum
decomposition of the corresponding Kac-Moody algebra:
$$\mathfrak{g}(C) \simeq \mathfrak{g}(C_1) \oplus \mathfrak{g}(C_2),$$
where the two Kac-Moody algebras in the right hand side are associated with the submatrices of C corresponding to
the index sets $I_1$ and $I_2$.
An important subclass of Kac-Moody algebras corresponds to symmetrizable generalized Cartan matrices C, which
can be decomposed as DS, where D is a diagonal matrix with positive integer entries and S is a symmetric matrix.
Under the assumptions that C is symmetrizable and indecomposable, the Kac-Moody algebras are divided into three
classes:
• A positive definite matrix S gives rise to a finite-dimensional simple Lie algebra.
• A positive semidefinite matrix S gives rise to an infinite-dimensional Kac-Moody algebra of affine type, or an
affine Lie algebra.
• An indefinite matrix S gives rise to a Kac-Moody algebra of indefinite type.
• Since the diagonal entries of C and S are positive, S cannot be negative definite or negative semidefinite.
Symmetrizable indecomposable generalized Cartan matrices of finite and affine type have been completely
classified. They correspond to Dynkin diagrams and affine Dynkin diagrams. Very little is known about the
Kac-Moody algebras of indefinite type. Among those, the main focus has been on the (generalized) Kac-Moody
algebras of hyperbolic type, for which the matrix S is indefinite, but for each proper subset of I, the corresponding
submatrix is positive definite or positive semidefinite. Such matrices have rank at most 10 and have also been
completely determined.
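For 2-by-2 matrices the three classes can be read off from a determinant. The sketch below assumes C = [[2, -a], [-b, 2]] with positive integers a and b; such a matrix is symmetrizable with D = diag(a, b), and since det C = det D · det S with det D > 0, the symmetric matrix S (which has positive diagonal) is positive definite, positive semidefinite, or indefinite according to the sign of det C = 4 - ab. The function name is hypothetical.

```python
def classify_rank_two(a, b):
    """Classify the generalized Cartan matrix [[2, -a], [-b, 2]].

    Returns "finite", "affine" or "indefinite" for an indecomposable
    matrix; decomposable or invalid input raises ValueError.
    """
    if a < 0 or b < 0 or (a == 0) != (b == 0):
        raise ValueError("not a generalized Cartan matrix")
    if a == 0:
        raise ValueError("decomposable: the algebra splits as a direct sum")
    det = 4 - a * b
    if det > 0:
        return "finite"
    if det == 0:
        return "affine"
    return "indefinite"


for a, b in [(1, 1), (1, 2), (1, 3), (2, 2), (1, 4), (3, 3)]:
    print((a, b), classify_rank_two(a, b))
```

The finite cases with ab < 4 correspond to the familiar rank-two Dynkin diagrams, and ab = 4 gives the two rank-two affine diagrams.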
References
• A. J. Wassermann, Lecture notes on Kac-Moody and Virasoro algebras [1]
• V. Kac, Infinite dimensional Lie algebras, ISBN 0521466938
• Hazewinkel, Michiel, ed. (2001), "Kac-Moody algebra" [2], Encyclopaedia of Mathematics, Springer,
ISBN 978-1556080104
• V.G. Kac, Simple irreducible graded Lie algebras of finite growth, Math. USSR Izv., 2 (1968) pp. 1271-1311;
Izv. Akad. Nauk USSR Ser. Mat., 32 (1968) pp. 1923-1967
• R.V. Moody, A new class of Lie algebras, J. of Algebra, 10 (1968) pp. 211-230
External links
• http://www.emis.de/journals/SIGMA/Kac-Moody_algebras.html
References
[1] http://arxiv.org/abs/1004.1287
[2] http://eom.springer.de/K/k055050.htm
Spectral theory
In mathematics, spectral theory is an inclusive term for theories extending the eigenvector and eigenvalue theory of
a single square matrix to a much broader theory of the structure of operators in a variety of mathematical spaces.[1] It
is a result of studies of linear algebra and the solutions of systems of linear equations and their generalizations. The
theory is connected to that of analytic functions because the spectral properties of an operator are related to analytic
functions of the spectral parameter.
Mathematical background
The name spectral theory was introduced by David Hilbert in his original formulation of Hilbert space theory, which
was cast in terms of quadratic forms in infinitely many variables. The original spectral theorem was therefore
conceived as a version of the theorem on principal axes of an ellipsoid, in an infinite-dimensional setting. The later
discovery in quantum mechanics that spectral theory could explain features of atomic spectra was therefore
fortuitous.
There have been three main ways to formulate spectral theory, all of which retain their usefulness. After Hilbert's
initial formulation, the later development of abstract Hilbert space and the spectral theory of a single normal operator
on it did very much go in parallel with the requirements of physics; particularly at the hands of von Neumann.[4] The
further theory built on this to include Banach algebras, which can be given abstractly. This development leads to the
Gelfand representation, which covers the commutative case, and further into non-commutative harmonic analysis.
The difference can be seen in making the connection with Fourier analysis. The Fourier transform on the real line is
in one sense the spectral theory of differentiation qua differential operator. But for that to cover the phenomena one
has already to deal with generalized eigenfunctions (for example, by means of a rigged Hilbert space). On the other
hand it is simple to construct a group algebra, the spectrum of which captures the Fourier transform's basic
properties, and this is carried out by means of Pontryagin duality.
One can also study the spectral properties of operators on Banach spaces. For example, compact operators on Banach
spaces have many spectral properties similar to that of matrices.
Physical background
The background in the physics of vibrations has been explained in this way:[5]
Spectral theory is connected with the investigation of localized vibrations of a variety of different objects, from atoms and molecules in
chemistry to obstacles in acoustic waveguides. These vibrations have frequencies, and the issue is to decide when such localized vibrations
occur, and how to go about computing the frequencies. This is a very complicated problem since every object has not only a fundamental tone
but also a complicated series of overtones, which vary radically from one body to another.
The mathematical theory is not dependent on such physical ideas on a technical level, but there are examples of
mutual influence (see for example Mark Kac's question Can you hear the shape of a drum?). Hilbert's adoption of
the term "spectrum" has been attributed to an 1897 paper of Wilhelm Wirtinger on Hill's equation (by Jean
Dieudonné), and it was taken up by his students during the first decade of the twentieth century, among them Erhard
Schmidt and Hermann Weyl. The conceptual basis for Hilbert space was developed from Hilbert's ideas by Erhard
Schmidt and Frigyes Riesz.[6][7] It was almost twenty years later, when quantum mechanics was formulated in terms
of the Schrödinger equation, that the connection was made to atomic spectra; a connection with the mathematical
physics of vibration had been suspected before, as remarked by Henri Poincaré, but rejected for simple quantitative
reasons, absent an explanation of the Balmer series. The later discovery in quantum mechanics that spectral theory
could explain features of atomic spectra was therefore fortuitous, rather than being an object of Hilbert's spectral
theory.
A definition of spectrum
Consider a bounded linear transformation T defined everywhere over a general Banach space. We form the
transformation:
$$R_\zeta = (\zeta I - T)^{-1}.$$
Here I is the identity operator and $\zeta$ is a complex number. The inverse of an operator T, that is $T^{-1}$, is defined by:
$$T T^{-1} = T^{-1} T = I.$$
If the inverse exists, T is called regular. If it does not exist, T is called singular.
With these definitions, the resolvent set of T is the set of all complex numbers $\zeta$ such that $R_\zeta$ exists and is bounded.
This set often is denoted as $\rho(T)$. The spectrum of T is the set of all complex numbers $\zeta$ such that $R_\zeta$ fails to exist or
is unbounded. Often the spectrum of T is denoted by $\sigma(T)$. The function $R_\zeta$ for all $\zeta$ in $\rho(T)$ (that is, wherever $R_\zeta$
exists) is called the resolvent of T. The spectrum of T is therefore the complement of the resolvent set of T in the
complex plane.[9] Every eigenvalue of T belongs to $\sigma(T)$, but $\sigma(T)$ may contain non-eigenvalues.[10]
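In finite dimensions the definition reduces to ordinary linear algebra: the resolvent exists exactly where $\zeta I - T$ is invertible, so the spectrum consists of the eigenvalues alone. The sketch below illustrates this for a 2-by-2 matrix; the function names and the example matrix are assumptions made for illustration, and the singularity test uses a small numerical tolerance.

```python
import cmath


def eigenvalues(t):
    """Eigenvalues of a 2x2 matrix from its characteristic polynomial."""
    tr = t[0][0] + t[1][1]
    det = t[0][0] * t[1][1] - t[0][1] * t[1][0]
    root = cmath.sqrt(tr * tr - 4 * det)
    return ((tr - root) / 2, (tr + root) / 2)


def resolvent(t, zeta):
    """Return R_zeta = (zeta I - T)^{-1}, or None when zeta I - T is singular."""
    a, b = zeta - t[0][0], -t[0][1]
    c, d = -t[1][0], zeta - t[1][1]
    det = a * d - b * c
    if abs(det) < 1e-12:
        return None            # zeta belongs to the spectrum
    return [[d / det, -b / det], [-c / det, a / det]]


T = [[2, 1], [0, 3]]           # assumed example; upper triangular

print(eigenvalues(T))          # the spectrum of T
print(resolvent(T, 1))         # 1 is in the resolvent set
print(resolvent(T, 2))         # 2 is an eigenvalue, so no resolvent exists
```

Non-eigenvalues in the spectrum can only occur in infinite dimensions, which a finite matrix example cannot exhibit.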
This definition applies to a Banach space, but of course other types of space exist as well, for example, topological
vector spaces include Banach spaces, but can be more general.[11][12] On the other hand, Banach spaces include
Hilbert spaces,[13] and it is these spaces that find the greatest application and the richest theoretical results. With
suitable restrictions, much can be said about the structure of the spectra of transformations in a Hilbert space. In
particular, for self-adjoint operators, the spectrum lies on the real line and (in general) is a spectral combination of a
point spectrum of discrete eigenvalues and a continuous spectrum.[14]
What is spectral theory, roughly speaking?
In functional analysis and linear algebra the spectral theorem establishes conditions under which an operator can be
expressed in simple form as a sum of simpler operators. As a full rigorous presentation is not appropriate for this
article, we take an approach that avoids much of the rigor and satisfaction of a formal treatment with the aim of
being more comprehensible to a non-specialist.
This topic is easiest to describe by introducing the bra-ket notation of Dirac for operators.[15][16] As an example, a
very particular linear operator L might be written as a dyadic product:[17][18]
$$L = |k_1\rangle \langle b_1|,$$
in terms of the "bra" $\langle b_1|$ and the "ket" $|k_1\rangle$. A function f is described by a ket as $|f\rangle$. The function f(x) defined
on the coordinates $(x_1, x_2, \dots)$ is denoted as:
$$f(x) = \langle x, f\rangle,$$
and the magnitude of f by:
$$\langle f, f\rangle = \int dx\, \langle f, x\rangle \langle x, f\rangle = \int dx\, f^*(x) f(x),$$
where the notation '*' denotes a complex conjugate. This inner product choice defines a very specific inner product
space, restricting the generality of the arguments that follow.[13]
The effect of L upon a function f is then described as:
$$L|f\rangle = |k_1\rangle \langle b_1|f\rangle,$$
expressing the result that the effect of L on f is to produce a new function $|k_1\rangle$ multiplied by the inner product
represented by $\langle b_1|f\rangle$.
A more general linear operator L might be expressed as:
$$L = \lambda_1 |e_1\rangle\langle f_1| + \lambda_2 |e_2\rangle\langle f_2| + \lambda_3 |e_3\rangle\langle f_3| + \dots,$$
where the $\{\lambda_i\}$ are scalars and the $\{|e_i\rangle\}$ are a basis and the $\{\langle f_i|\}$ a reciprocal basis for the space. The
relation between the basis and the reciprocal basis is described, in part, by:
$$\langle f_i|e_j\rangle = \delta_{ij}.$$
If such a formalism applies, the $\{\lambda_i\}$ are eigenvalues of L and the functions $\{|e_i\rangle\}$ are eigenfunctions of L.
The eigenvalues are in the spectrum of L.[19]
Some natural questions are: under what circumstances does this formalism work, and for what operators L are
expansions in series of other operators like this possible? Can any function / be expressed in terms of the
eigenfunctions (are they a complete set) and under what circumstances does a point spectrum or a continuous
spectrum arise? How do the formalisms for infinite dimensional spaces and finite dimensional spaces differ, or do
they differ? Can these ideas be extended to a broader class of spaces? Answering such questions is the realm of
spectral theory and requires considerable background in functional analysis and matrix algebra.
Resolution of the identity
This section continues in the rough and ready manner of the above section using the bra-ket notation, and glossing
over the many important and fascinating details of a rigorous treatment.[20] A rigorous mathematical treatment may
be found in various references.[21]
Using the bra-ket notation of the above section, the identity operator may be written as:
$$I = \sum_{i=1}^n |e_i\rangle \langle f_i|,$$
where it is supposed as above that $\{|e_i\rangle\}$ are a basis and the $\{\langle f_i|\}$ a reciprocal basis for the space satisfying
the relation:
$$\langle f_i|e_j\rangle = \delta_{ij}.$$
This expression of the identity operation is called a representation or a resolution of the identity. This formal
representation satisfies the basic property of the identity:
$$I^n = I,$$
valid for every positive integer n.
Applying the resolution of the identity to any function in the space, $|\psi\rangle$, one obtains:
$$I|\psi\rangle = |\psi\rangle = \sum_{i=1}^n |e_i\rangle \langle f_i|\psi\rangle = \sum_{i=1}^n c_i |e_i\rangle,$$
which is the generalized Fourier expansion of $\psi$ in terms of the basis functions $\{e_i\}$.[22]
Given some operator equation of the form:
$$O|\psi\rangle = |h\rangle$$
with h in the space, this equation can be solved in the above basis through the formal manipulations:
$$O|\psi\rangle = \sum_{i=1}^n c_i\, (O|e_i\rangle) = \sum_{i=1}^n |e_i\rangle \langle f_i|h\rangle,$$
$$\langle f_j|O|\psi\rangle = \sum_{i=1}^n c_i \langle f_j|O|e_i\rangle = \sum_{i=1}^n \langle f_j|e_i\rangle \langle f_i|h\rangle = \langle f_j|h\rangle, \quad \forall j,$$
which converts the operator equation to a matrix equation determining the unknown coefficients $c_i$ in terms of the
generalized Fourier coefficients $\langle f_j|h\rangle$ of h and the matrix elements $O_{ji} = \langle f_j|O|e_i\rangle$ of the operator O.
The role of spectral theory arises in establishing the nature and existence of the basis and the reciprocal basis. In
particular, the basis might consist of the eigenfunctions of some linear operator L:
$$L|e_i\rangle = \lambda_i |e_i\rangle,$$
with the $\{\lambda_i\}$ the eigenvalues of L from the spectrum of L. Then the resolution of the identity above provides the
dyad expansion of L:
$$LI = L = \sum_{i=1}^n L|e_i\rangle \langle f_i| = \sum_{i=1}^n \lambda_i |e_i\rangle \langle f_i|.$$
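The resolution of the identity and the dyad expansion can be verified by hand for a small non-symmetric matrix, where the basis and the reciprocal basis genuinely differ. The matrix, its eigenvectors, and the helper names below are assumptions chosen for illustration; all entries are real, so no complex conjugation is needed in the pairing.

```python
L = [[2, 1], [0, 3]]       # assumed example operator on a 2-dimensional space
eigenvalues = [2, 3]
e = [[1, 0], [1, 1]]       # kets |e_1>, |e_2>: right eigenvectors of L
f = [[1, -1], [0, 1]]      # bras <f_1|, <f_2|: the reciprocal basis


def pairing(bra, ket):
    """The number <f|e> for real vectors."""
    return sum(b * k for b, k in zip(bra, ket))


def dyad(ket, bra):
    """The operator |e><f| as a matrix."""
    return [[k * b for b in bra] for k in ket]


def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def scale(s, a):
    return [[s * a[i][j] for j in range(2)] for i in range(2)]


# Biorthogonality <f_i|e_j> = delta_ij.
gram = [[pairing(f[i], e[j]) for j in range(2)] for i in range(2)]

# Resolution of the identity: sum_i |e_i><f_i|.
identity = add(dyad(e[0], f[0]), dyad(e[1], f[1]))

# Dyad expansion: sum_i lambda_i |e_i><f_i|.
expansion = add(scale(eigenvalues[0], dyad(e[0], f[0])),
                scale(eigenvalues[1], dyad(e[1], f[1])))

print(gram)
print(identity)
print(expansion == L)
```

Because L is not symmetric, the kets are not orthogonal to each other; it is the pairing with the reciprocal basis that produces the Kronecker delta.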
Resolvent operator
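Using spectral theory, the resolvent operator R:
$$R = (\lambda - L)^{-1}$$
can be evaluated in terms of the eigenfunctions and eigenvalues of L, and the Green's function corresponding to L
can be found.
Applying R to some arbitrary function in the space, say $\varphi$,
$$R|\varphi\rangle = (\lambda - L)^{-1} |\varphi\rangle = \sum_{i=1}^n \frac{1}{\lambda - \lambda_i} |e_i\rangle \langle f_i, \varphi\rangle.$$
This function has poles in the complex $\lambda$-plane at each eigenvalue of L. Thus, using the calculus of residues:
$$\frac{1}{2\pi i} \oint_C d\lambda\, (\lambda - L)^{-1} |\varphi\rangle = -\sum_{i=1}^n |e_i\rangle \langle f_i, \varphi\rangle = -|\varphi\rangle,$$
where the line integral is over a contour C that includes all the eigenvalues of L. (The minus sign corresponds to traversing C clockwise; with the counterclockwise orientation each simple pole contributes +1 instead.)
Suppose our functions are defined over some coordinates $\{x_j\}$, that is:
$$\langle x, \varphi\rangle = \varphi(x_1, x_2, \dots),$$
where the bra-kets corresponding to $\{x_j\}$ satisfy:
$$\langle x, y\rangle = \delta(x - y),$$
and where $\delta(x - y) = \delta(x_1 - y_1, x_2 - y_2, x_3 - y_3, \dots)$ is the Dirac delta function.[24]
Then:
$$\left\langle x, \frac{1}{2\pi i} \oint_C d\lambda\, (\lambda - L)^{-1} \varphi \right\rangle = \frac{1}{2\pi i} \oint_C d\lambda\, \langle x, (\lambda - L)^{-1} \varphi\rangle = \frac{1}{2\pi i} \oint_C d\lambda \int dy\, \langle x, (\lambda - L)^{-1} y\rangle \langle y, \varphi\rangle.$$
The function $G(x, y; \lambda)$ defined by:
$$G(x, y; \lambda) = \langle x, (\lambda - L)^{-1} y\rangle = \sum_{i=1}^n \sum_{j=1}^n \langle x, e_i\rangle \langle f_i, (\lambda - L)^{-1} e_j\rangle \langle f_j, y\rangle$$
$$= \sum_{i=1}^n \frac{\langle x, e_i\rangle \langle f_i, y\rangle}{\lambda - \lambda_i} = \sum_{i=1}^n \frac{e_i(x) f_i^*(y)}{\lambda - \lambda_i},$$
is called the Green's function for operator L,[25] and satisfies:
$$\frac{1}{2\pi i} \oint_C d\lambda\, G(x, y; \lambda) = -\sum_{i=1}^n \langle x, e_i\rangle \langle f_i, y\rangle = -\langle x, y\rangle = -\delta(x - y).$$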
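In a finite-dimensional setting the Green's function is just the matrix of the resolvent, and its eigenfunction expansion can be compared against a direct matrix inverse. The sketch below does this with exact rational arithmetic for an assumed 2-by-2 example; the matrix, its eigenvectors, its reciprocal basis, and the function names are all illustrative choices.

```python
from fractions import Fraction

eigenvalues = [2, 3]       # eigenvalues of the assumed matrix L = [[2, 1], [0, 3]]
e = [[1, 0], [1, 1]]       # kets |e_i>: right eigenvectors of L
f = [[1, -1], [0, 1]]      # bras <f_i|: reciprocal basis, <f_i|e_j> = delta_ij


def resolvent_from_spectrum(lam):
    """sum_i |e_i><f_i| / (lam - lambda_i), the expansion of (lam - L)^{-1}."""
    lam = Fraction(lam)
    total = [[Fraction(0), Fraction(0)], [Fraction(0), Fraction(0)]]
    for lam_i, ket, bra in zip(eigenvalues, e, f):
        weight = 1 / (lam - lam_i)
        for r in range(2):
            for c in range(2):
                total[r][c] += weight * ket[r] * bra[c]
    return total


def resolvent_direct(lam):
    """(lam I - L)^{-1} by the explicit 2x2 inverse formula."""
    lam = Fraction(lam)
    a, b = lam - 2, Fraction(-1)
    c, d = Fraction(0), lam - 3
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


for lam in (1, 5, Fraction(1, 2)):
    print(lam, resolvent_from_spectrum(lam) == resolvent_direct(lam))
```

Both functions raise ZeroDivisionError when lam equals an eigenvalue, which is the finite-dimensional counterpart of the poles described above.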
Operator equations
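Consider the operator equation:
$$(O - \lambda I)\, |\psi\rangle = |h\rangle;$$
in terms of coordinates:
$$\int dy\, \langle x, (O - \lambda I)\, y\rangle \langle y, \psi\rangle = h(x).$$
A particular case is $\lambda = 0$.
The Green's function for this equation, defined as in the previous section but with the operator $O - \lambda I$ in place of $\lambda - L$, is:
$$\langle y, G(\lambda)\, z\rangle = \langle y, (O - \lambda I)^{-1} z\rangle = G(y, z; \lambda),$$
and satisfies:
$$\int dy\, \langle x, (O - \lambda I)\, y\rangle \langle y, G(\lambda)\, z\rangle = \int dy\, \langle x, (O - \lambda I)\, y\rangle \langle y, (O - \lambda I)^{-1} z\rangle = \langle x, z\rangle = \delta(x - z).$$
Using this Green's function property:
$$\int dy\, \langle x, (O - \lambda I)\, y\rangle\, G(y, z; \lambda) = \delta(x - z).$$
Then, multiplying both sides of this equation by h(z) and integrating:
$$\int dz\, h(z) \int dy\, \langle x, (O - \lambda I)\, y\rangle\, G(y, z; \lambda) = \int dy\, \langle x, (O - \lambda I)\, y\rangle \int dz\, h(z)\, G(y, z; \lambda) = h(x),$$
which suggests the solution is:
$$\psi(x) = \int dz\, h(z)\, G(x, z; \lambda).$$
That is, the function $\psi(x)$ satisfying the operator equation is found if we can find the spectrum of O, and construct G,
for example by using the eigenfunction expansion (with $e_i$, $f_i$ and $\lambda_i$ now the eigenfunctions, reciprocal basis and eigenvalues of O):
$$G(x, z; \lambda) = \sum_{i=1}^n \frac{e_i(x) f_i^*(z)}{\lambda_i - \lambda}.$$
The denominator is $\lambda_i - \lambda$ because G here represents $(O - \lambda I)^{-1}$, the negative of the resolvent $(\lambda - O)^{-1}$ expanded in the previous section.
There are many other ways to find G, of course.[26] See the articles on Green's functions and on Fredholm integral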
equations. It must be kept in mind that the above mathematics is purely formal, and a rigorous treatment involves
some pretty sophisticated mathematics, including a good background knowledge of functional analysis, Hilbert
spaces, distributions and so forth. Consult these articles and the references for more detail.
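In finite dimensions the integral over z becomes a sum, and the recipe above turns into a short computation: expand h in the reciprocal basis and divide each coefficient by $\lambda_i - \lambda$. The sketch below assumes a 2-by-2 example operator with known eigenvectors; the matrix and all names are illustrative, and exact rational arithmetic is used so the check is free of rounding.

```python
from fractions import Fraction

O = [[2, 1], [0, 3]]       # assumed example operator
eigenvalues = [2, 3]
e = [[1, 0], [1, 1]]       # kets |e_i>: right eigenvectors of O
f = [[1, -1], [0, 1]]      # bras <f_i|: reciprocal basis, <f_i|e_j> = delta_ij


def pairing(bra, ket):
    """The number <f|h> for real vectors."""
    return sum(b * k for b, k in zip(bra, ket))


def solve(h, lam=0):
    """Solve (O - lam I) psi = h as psi = sum_i |e_i> <f_i|h> / (lambda_i - lam).

    lam must not be an eigenvalue of O, otherwise a division by zero occurs.
    """
    psi = [Fraction(0), Fraction(0)]
    for lam_i, ket, bra in zip(eigenvalues, e, f):
        coeff = Fraction(pairing(bra, h)) / (lam_i - lam)
        psi = [p + coeff * k for p, k in zip(psi, ket)]
    return psi


def apply_shifted(psi, lam=0):
    """Compute (O - lam I) psi, used only to verify a solution."""
    return [sum(O[i][k] * psi[k] for k in range(2)) - lam * psi[i]
            for i in range(2)]


h = [1, 1]
psi = solve(h)             # the particular case lam = 0
print(psi)
print(apply_shifted(psi) == h)
```

Substituting the result back into the equation, as the last line does, is a cheap way to confirm that the expansion coefficients were formed with the correct sign convention.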
Notes
[1] Jean Alexandre Dieudonné (1981). History of functional analysis (http://books.google.com/books?id=mg7r4acKgqOC&
printsec=frontcover). Elsevier. ISBN 0444861483.
[2] William Arveson (2002). "Chapter 1: spectral theory and Banach algebras" (http://books. google. com/books ?id=ARdehHGWVlQC). A
short course on spectral theory. Springer. ISBN 0387953000. .
[3] Viktor Antonovich Sadovnichii (1991). "Chapter 4: The geometry of Hilbert space: the spectral theory of operators" (http://books. google.
com/books ?id=SRlQkG60kVEC&pg=PA181&lpg=PA181). Theory of Operators. Springer. ISBN 0306110288. .
[4] John von Neumann (1996). The mathematical foundations of quantum mechanics; Volume 2 in Princeton Landmarks in Mathematics series
(http://books. google.com/books ?id=JLyCo3R04qUC&printsec=frontcover&dq=mathematical+foundations+of+quantum+mechanics+
inauthor:von+inauthor:neumann&lr=&as_drrb_is^
cd=l#v=onepage&q=&f=false) (Reprint of translation of original 1932 ed.). Princeton University Press. ISBN 0691028931. .
[5] E. Brian Davies, quoted on the King's College London analysis group website "Research at the analysis group" (http://www.kcl.ac.uk/
schools/pse/maths/research/analysis/research.html). .
[6] Nicholas Young (1988). An introduction to Hilbert space (http://books. google. com/books ?id=_igwFHKwcyYC&pg=PA3). Cambridge
University Press, p. 3. ISBN 0521337178. .
[7] Jean-Luc Dorier (2000). On the teaching of linear algebra; Vol. 23 of Mathematics education library (http://books.google.com/
books ?id=gqZUGMKtNuoC&pg=PA50&dq="thinking+geometa^
as_miny_is=&as_maxm_is=0&as_maxy_is=&as_brr=0&cd=l#v=onepage&q="thinking geometrically in Hilbert's "&f=false). Springer.
ISBN 0792365399. .
[8] Cf. Spectra in mathematics and in physics (http://www.dm.unito.it/personalpages/capietto/Spectra.pdf) by Jean Mawhin, p.4 and pp.
10-11.
[9] Edgar Raymond Lorch (2003). Spectral Theory (http://books. google.com/books ?id=X3U2AAAACAAJ&dq=intitle:spectral+
intitle:theory+inauthor:Lorch&lr=&as_drrb_is=q&as_minm_is=0&as_miny_is=&as_maxm_is=0&as_maxy_is=&
(Reprint of Oxford 1962 ed.). Textbook Publishers, p. 89. ISBN 0758171560. .
[10] Nicholas Young, op. cit (http://books.google.com/books ?id=_igwFHKwcyYC&pg=PA81). p. 81. ISBN 0521337178. .
[11] Helmut H. Schaefer, Manfred P. H. Wolff (1999). Topological vector spaces (http://books.google.com/books ?id=9kXY742pABoC&
pg=PA36) (2nd ed.). Springer, p. 36. ISBN 0387987266. .
[12] Dmitrii Petrovich Zhelobenko (2006). Principal structures and methods of representation theory (http://books.google.com/
books?id=3TkmvZktjp8C&pg=PA24). American Mathematical Society. ISBN 0821837311. .
[13] Edgar Raymond Lorch (2003). "Chapter III: Hilbert Space" (http://books.google.com/books ?id=X3U2AAAACAAJ&
dq=intitle : spectral+intitle : theory +im
as_brr=0&cd=l). op. cit.. p. 57. ISBN 0758171560. .
[14] Edgar Raymond Lorch (2003). "Chapter V: The Structure of Self-Adjoint Transformations" (http://books.google.com/
books ?id=X3U2AAAACAAJ&dq=intitle:spectral+intitle:theory
as_maxm_is=0&as_maxy_is=&as_brr=0&cd=l). op. cit.. p. 106 ff. ISBN 0758171560. .
[15] Bernard Friedman (1990). Principles and Techniques of Applied Mathematics (http://books. google.com/books ?id=gnQeAQAAIAAJ&
dq=intitle : applied+intitle : mathemati
as_maxy_is=&as_brr=0&cd=l) (Reprint of 1956 Wiley ed.). Dover Publications, p. 26. ISBN 0486664449. .
[16] PAM Dirac (1981). The principles of quantum mechanics (http://books.google.com/books?id=XehUpGiM6FIC&pg=PA29) (4th ed.).
Oxford University Press, p. 29 ff. ISBN 0198520115.
[17] Jürgen Audretsch (2007). "Chapter 1.1.2: Linear operators on the Hilbert space" (http://books.google.com/books?id=8NxIgwAOU6IC&
pg=PA5). Entangled systems: new directions in quantum physics. Wiley-VCH. p. 5. ISBN 3527406840.
[18] R. A. Howland (2006). Intermediate dynamics: a linear algebraic approach (http://books. google.com/books ?id=SepP8-W3M0AC&
pg=PA69&dq=dyad+representation+operator&lr=&as_drrb_is=q&as
as_brr=0&cd=32#v=onepage&q=dyad representation operator&f=false) (2nd ed.). Birkhauser. p. 69 ff. ISBN 0387280596. .
[19] Bernard Friedman (1990). "Chapter 2: Spectral theory of operators" (http://books. google. com/books ?id=gnQeAQAAIAAJ&
dq=intitle : applied+intitle : mathemati
as_maxy_is=&as_brr=0&cd=l). op. cit.. p. 57. ISBN 0486664449. .
[20] See discussion in Dirac's book referred to above, and Milan Vujicic (2008). Linear algebra thoroughly explained (http://books. google.
com/books 7id=pifStNLaXGkC&pg=PA274). Springer, p. 274. ISBN 3540746374. .
[21] See, for example, the fundamental text of John von Neumann, op. cit (http://books. google.com/books ?id=JLyCo3R04qUC&
printsec=frontcover). ISBN 0691028931. . and Arch W. Naylor, George R. Sell (2000). Linear Operator Theory in Engineering and Science;
Vol. 40 of Applied mathematical science (http://books. google. com/books ?id=t3SXs4-KrE0C&pg=PA401). Springer, p. 401.
ISBN 038795001X. ., Steven Roman (2008). Advanced linear algebra (http://books.google.com/books 7id=bSyQr-wUys8C&pg=PA233)
(3rd ed.). Springer. ISBN 0387728287; and Iurii Makarovich Berezanskii (1968). Expansions in eigenfunctions of self adjoint operators; Vol.
17 in Translations of mathematical monographs (http://books. google. com/books ?id=OPPWBE3WQqkC&pg=PA3 17). American
Mathematical Society. ISBN 0821815679. .
[22] See for example, Gerald B Folland (2009). "Convergence and completeness" (http://books. google. com/books ?id=idAomhpwI8MC&
pg=PA77). Fourier Analysis and its Applications (Reprint of Wadsworth & Brooks/Cole 1992 ed.). American Mathematical Society, pp. 77 ff.
ISBN 0821847902. .
[23] PAM Dirac. op. cit (http://books.google.com/books?id=XehUpGiM6FIC&pg=PA65#v=onepage&q=&f=false). p. 65 ff.
ISBN 0198520115. .
[24] PAM Dirac. op. cit (http://books.google.com/books?id=XehUpGiM6FIC&pg=PA60#v=onepage&q=&f=false). p. 60 ff.
ISBN 0198520115. .
[25] Bernard Friedman, op. cit (http://books. google. com/books ?id=gnQeAQAAIAAJ&dq=intitle:applied+intitle:mathem
inauthor:Friedman&lr=&as_drrb_is=q&as_minmJ^ p. 214, Eq.
2.14. ISBN 0486664449. .
[26] For example, see Sadri Hassani (1999). "Chapter 20: Green's functions in one dimension" (http://books.google.com/
books ?id=BCMLOp6DyFIC&pg=RAl-PA553). Mathematical physics: a modern introduction to its foundations . Springer, p. 553 et seq.
ISBN 0387985794. . and Qing-Hua Qin (2007). Green's function and boundary elements of multifield materials (http://books.google.com/
books ?id=UUfy8CcJiDkC&printsec=frontcover). Elsevier. ISBN 0080451349. .
General references
• Hazewinkel, Michiel, ed. (2001), "Spectral theory of linear operators" (http://eom.springer.de/S/s086520.htm),
Encyclopaedia of Mathematics, Springer, ISBN 978-1556080104
• Nelson Dunford; Jacob T Schwartz (1988). Linear Operators, Spectral Theory, Self Adjoint Operators in Hilbert
Space (Part 2) (http://books.google.com/books ?id=eOFfQQAACAAJ&dq=isbn:0471608475&cd=l)
(Paperback reprint of 1967 ed.). Wiley. ISBN 0471608475.
• Nelson Dunford; Jacob T Schwartz (1988). Linear Operators, Spectral Operators (Part 3) (http: //books. google.
conVbooks?id=B0SeJNIh3BwC&printsec=frontcover&dq=isbn:0471608467&cd=l#v=onepage&q=&
(Paperback reprint of 1971 ed.). Wiley. ISBN 0471608467.
• Sadri Hassani (1999). "Chapter 4: Spectral decomposition" (http://books.google.com/
books ?id=BCMLOp6DyFIC&pg=RAl-PA109#v=onepage&q=&f=false). Mathematical physics: a modern
introduction to its foundations . Springer. ISBN 0387985794.
• Edward Brian Davies (1996). Spectral Theory and Differential Operators; Volume 42 in the Cambridge Studies
in Advanced Mathematics (http: //books. google. com/books ?id=EXtKu J AksSUC&printsec=frontcover&
dq=intitle : Spectral+intitle : Theory +intitle : and+intitle : Diff erential+intitle : Operators&lr=&as_drrb_is=q&
as_minm_is=0&as_miny_is=&as_maxm_is=0&as_maxy_is=&as3rr=0&cd=l#v=onepage&q=Spectral
theory &f=false). Cambridge University Press. ISBN 0521587107.
• Arch W. Naylor, George R. Sell (2000). "Chapter 5, Part B: The Spectrum" (http://books.google.com/
books?id=t3SXs4-KrE0C&pg=PA4 1 1 &dq="resolution+of+the+identity "&lr=&as_drrb_is=q&
as_minm_is=0&as_miny_is=&as_maxm_is=0&as of
the identity "&f=false). Linear Operator Theory in Engineering and Science; Volume 40 of Applied
mathematical sciences. Springer, p. 411. ISBN 038795001X.
• Shmuel Kantorovitz (1983). Spectral Theory ofBanach Space Operators;. Springer.
External links
• Evans M. Harrell II (http://www.mathphysics.com/opthy/OpHistory.html): A Short History of Operator
Theory
• Gregory H. Moore (1995). "The axiomatization of linear algebra: 1875-1940". Historia Mathematica 22:
262-303. doi:10.1006/hmat.1995.1025.
Quantum Field Theory, SUSY, Quantum
Geometry and Quantum Algebraic Topology
Quantum electrodynamics
Quantum electrodynamics (QED) is the relativistic quantum field theory of electrodynamics. In essence, it
describes how light and matter interact and is the first theory where full agreement between quantum mechanics and
special relativity is achieved. QED mathematically describes all phenomena involving electrically charged particles
interacting by means of exchange of photons and represents the quantum counterpart of classical electrodynamics
giving a complete account of matter and light interaction. One of the founding fathers of QED, Richard Feynman,
has called it "the jewel of physics" for its extremely accurate predictions of quantities like the anomalous magnetic
moment of the electron, and the Lamb shift of the energy levels of hydrogen.[1]
In technical terms, QED can be described as a perturbation theory of the electromagnetic quantum vacuum.
History
The first formulation of a quantum theory describing radiation and matter interaction is due to Paul Adrien Maurice
Dirac, who, during the 1920s, was first able to compute the coefficient of spontaneous emission of an atom.[2]
Dirac described the quantization of the electromagnetic field as an ensemble of harmonic oscillators with the
introduction of the concept of creation and annihilation operators of particles. In the following years, with
contributions from Wolfgang Pauli, Eugene Wigner, Pascual Jordan, Werner Heisenberg and an elegant formulation of
quantum electrodynamics due to Enrico Fermi,[3] physicists came to believe that, in principle, it would be possible to
perform any computation for any physical process involving photons and charged particles. However, further studies by
Felix Bloch with Arnold Nordsieck,[4] and Victor Weisskopf,[5] in 1937 and 1939, revealed that such computations were
reliable only at a first order of perturbation theory, a problem already pointed out by Robert Oppenheimer.[6] At
higher orders in the series infinities emerged, making such computations meaningless and casting serious doubts on the
internal consistency of the theory itself. With no solution for this problem known at the time, it appeared that a
fundamental incompatibility existed between special relativity and quantum mechanics.
Difficulties with the theory increased through the end of the 1940s. Improvements in microwave technology made it
possible to take more precise measurements of the shift of the levels of a hydrogen atom,[7] now known as the Lamb
shift, and of the magnetic moment of the electron.[8] These experiments unequivocally exposed discrepancies which the
theory was unable to explain.
A first indication of a possible way out was given by Hans Bethe. In 1947, while he was traveling by train to reach
Schenectady from New York,[9] after giving a talk at the conference at Shelter Island on the subject, Bethe completed
the first non-relativistic computation of the shift of the lines of the hydrogen atom as measured by Lamb and
Retherford.[10] Despite the limitations of the computation, agreement was excellent. The idea was simply to attach
infinities to corrections of mass and charge that were actually fixed to a finite value by experiments. In this way, the
infinities get absorbed in those constants and yield a finite result in good agreement with experiments. This
procedure was named renormalization.
[Figure: Hans Bethe]
Based on Bethe's intuition and fundamental papers on the subject by Sin-Itiro Tomonaga,[11] Julian Schwinger,[12][13]
Richard Feynman[14][15][16] and Freeman Dyson,[17][18] it was finally possible to get fully covariant formulations that
were finite at any order in a perturbation series of quantum electrodynamics. Sin-Itiro Tomonaga, Julian Schwinger and
Richard Feynman were jointly awarded the Nobel Prize in Physics in 1965 for their work in this area.[19] Their
contributions, and those of Freeman Dyson, were about covariant and gauge invariant formulations of quantum
electrodynamics that allow computations of observables at any order of perturbation theory. Feynman's mathematical
technique, based on his diagrams, initially seemed very different from the field-theoretic, operator-based approach of
Schwinger and Tomonaga, but Freeman Dyson later showed that the two approaches were equivalent.[17]
Renormalization, the need to attach a physical meaning to certain divergences appearing in the theory through
integrals, has subsequently become one of the fundamental aspects of quantum field theory and has come to be seen as a
criterion for a theory's general acceptability. Even though renormalization works very well in practice, Feynman was
never entirely comfortable with its mathematical validity, even referring to renormalization as a "shell game" and
"hocus pocus".[20]
[Figure: Shelter Island Conference group photo (Courtesy of Archives, National Academy of Sciences).]
[Figure: Feynman (center) and Oppenheimer (left) at Los Alamos.]
QED has served as the model and template for all subsequent quantum field theories. One such subsequent theory is
quantum chromodynamics, which began in the early 1960s and attained its present form in the 1975 work by H.
David Politzer, Sidney Coleman, David Gross and Frank Wilczek. Building on the pioneering work of Schwinger,
Gerald Guralnik, Dick Hagen, and Tom Kibble,[21][22] Peter Higgs, Jeffrey Goldstone, and others, Sheldon
Glashow, Steven Weinberg and Abdus Salam independently showed how the weak nuclear force and quantum
electrodynamics could be merged into a single electroweak force.
Feynman's view of quantum electrodynamics
Introduction
Near the end of his life, Richard P. Feynman gave a series of lectures on QED intended for the lay public. These
lectures were transcribed and published as Feynman (1985), QED: The Strange Theory of Light and Matter,[1] a
classic non-mathematical exposition of QED from the point of view articulated below.
The key components of Feynman's presentation of QED are three basic actions.
• A photon goes from one place and time to another place and time.
• An electron goes from one place and time to another place and time.
• An electron emits or absorbs a photon at a certain place and time.
These actions are represented in a form of
visual shorthand by the three basic elements
of Feynman diagrams: a wavy line for the
photon, a straight line for the electron and a
junction of two straight lines and a wavy
one for a vertex representing emission or
absorption of a photon by an electron. These
may all be seen in the adjacent diagram.
[Figure: Feynman diagram elements (axes: space and time): a wavy line for a photon, a straight line for an electron,
and a vertex for photon emission or absorption.]
It is important not to over-interpret these
diagrams. Nothing is implied about how a
particle gets from one point to another. The
diagrams do not imply that the particles are
moving in straight or curved lines. They do
not imply that the particles are moving with
fixed speeds. The fact that the photon is often represented, by convention, by a wavy line and not a straight one does
not imply that it is thought that it is more wavelike than is an electron. The images are just symbols to represent the
actions above: photons and electrons do, somehow, move from point to point and electrons, somehow, emit and
absorb photons. We do not know how these things happen, but the theory tells us about the probabilities of these
things happening. Trajectory is a meaningless concept in quantum mechanics.
As well as the visual shorthand for the actions, Feynman introduces another kind of shorthand for the numerical
quantities which tell us about the probabilities. If a photon moves from one place and time (in shorthand, A) to
another place and time (in shorthand, B), the associated quantity is written in Feynman's shorthand as P(A to B). The
similar quantity for an electron moving from C to D is written E(C to D). The quantity which tells us about the
probability for the emission or absorption of a photon he calls 'j'. This is related to, but not the same as, the measured
electron charge 'e'.
QED is based on the assumption that complex interactions of many electrons and photons can be represented by
fitting together a suitable collection of the above three building blocks, and then using the probability-quantities to
calculate the probability of any such complex interaction. It turns out that the basic idea of QED can be
communicated while making the assumption that the quantities mentioned above are just our everyday probabilities.
(A simplification of Feynman's book.) Later on this will be corrected to include specifically quantum mathematics,
following Feynman.
The basic rules of probabilities that will be used are that a) if an event can happen in a variety of different ways then
its probability is the sum of the probabilities of the possible ways and b) if a process involves a number of
independent subprocesses then its probability is the product of the component probabilities.
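The two rules can be sketched in a few lines of Python. This is only an illustration under the simplifying assumption made above, namely that the quantities are everyday probabilities. The function names and all numerical values are made up for the example and do not come from Feynman's book.

```python
from math import prod

def prob_of_alternatives(ways):
    """Rule a): an event that can happen in several different ways."""
    return sum(ways)

def prob_of_sequence(steps):
    """Rule b): a process made of independent subprocesses."""
    return prod(steps)

# Made-up everyday probabilities, chosen only for illustration.
E_A_to_C = 0.3   # electron goes from A to C
P_B_to_D = 0.5   # photon goes from B to D

# Both subprocesses must happen, so multiply (rule b).
direct = prob_of_sequence([E_A_to_C, P_B_to_D])

# A second, distinct way to reach the same end result, so add (rule a).
another_way = 0.02
total = prob_of_alternatives([direct, another_way])
print(direct, total)
```

Keeping the two rules as separate functions mirrors the way they are used later: multiplication inside one diagram, addition across diagrams.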
Quantum electrodynamics
188
Basic constructions
Suppose we start with one electron at a certain place and time (this place and time being given the arbitrary label A)
and a photon at another place and time (given the label B). A typical question from a physical standpoint is: 'What is
the probability of finding an electron at C (another place and a later time) and a photon at D (yet another place and
time)?'. The simplest process to achieve this end is for the electron to move from A to C (an elementary action) and
for the photon to move from B to D (another elementary action). Knowing the probabilities of each of these
subprocesses, E(A to C) and P(B to D), we would expect to calculate the probability of both happening by multiplying
them, using rule b) above. This gives a simple estimated answer to our question.
But there are other ways in which the end result could come about.
The electron might move to a place and time E where it absorbs
the photon; then move on before emitting another photon at F;
then move on to C where it is detected, while the new photon
moves on to D. The probability of this complex process can again
be calculated by knowing the probabilities of each of the
individual actions: three electron actions, two photon actions and
two vertices (one emission and one absorption). We would expect
to find the total probability by multiplying the probabilities of each
of the actions, for any chosen positions of E and F. We then, using
rule a) above, have to add up all these probabilities for all the
alternatives for E and F. (This is not elementary in practice, and
involves integration.) But there is another possibility: that is that the electron first moves to G where it emits a
photon which goes on to D, while the electron moves on to H, where it absorbs the first photon, before moving on to
C. Again we can calculate the probability of these possibilities (for all points G and H). We then have a better
estimation for the total probability by adding the probabilities of these two possibilities to our original simple
estimate. Incidentally, the name given to this process of a photon interacting with an electron in this way is Compton
scattering.
[Figure: Compton scattering]
There are an infinite number of other intermediate processes in which more and more photons are absorbed and/or
emitted. For each of these possibilities there is a Feynman diagram describing it. This implies a complex
computation for the resulting probabilities, but provided it is the case that the more complicated the diagram the less
it contributes to the result, it is only a matter of time and effort to find as accurate an answer as one wants to the
original question. This is the basic approach of QED. To calculate the probability of any interactive process between
electrons and photons it is a matter of first noting, with Feynman diagrams, all the possible ways in which the
process can be constructed from the three basic elements. Each diagram involves some calculation involving definite
rules to find the associated probability.
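The bookkeeping described in this section (multiply along one diagram, then add over every choice of the intermediate points) can be sketched as follows. This is a toy model under loud assumptions: space-time is replaced by five integer labels, `toy_E`, `toy_P` and `J` are invented stand-ins for E, P and j, and ordinary probabilities are used rather than amplitudes. The intermediate points are called X and Y here to avoid a clash with the symbol E.

```python
from itertools import product

POINTS = range(5)   # a tiny, hypothetical set of space-time labels
J = 0.1             # made-up factor for one emission or absorption

def toy_E(a, b):
    """Made-up 'probability' for an electron to go from a to b."""
    return 0.5 / (1 + abs(a - b))

def toy_P(a, b):
    """Made-up 'probability' for a photon to go from a to b."""
    return 0.5 / (1 + abs(a - b))

def direct(A, B, C, D):
    # Electron A -> C and photon B -> D with no interaction (rule b).
    return toy_E(A, C) * toy_P(B, D)

def one_exchange(A, B, C, D):
    # The electron goes A -> X and absorbs the photon that came from B,
    # travels X -> Y, emits a new photon at Y, then goes Y -> C while
    # the new photon goes Y -> D.
    total = 0.0
    for X, Y in product(POINTS, repeat=2):        # rule a): add alternatives
        total += (toy_E(A, X) * toy_P(B, X) * J   # absorption at X
                  * toy_E(X, Y) * J               # emission at Y
                  * toy_E(Y, C) * toy_P(Y, D))    # rule b): multiply steps
    return total

A, B, C, D = 0, 1, 4, 3
first_estimate = direct(A, B, C, D)
correction = one_exchange(A, B, C, D)
better_estimate = first_estimate + correction
print(first_estimate, correction, better_estimate)
```

In this toy the more complicated diagram contributes less than the simple one, which is the condition the text requires for the step-by-step approach to be useful. In the real theory the sum over intermediate points is an integral.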
That basic scaffolding remains when one moves to a quantum description, but some conceptual changes are
required. One is that whereas we might expect in our everyday life that there would be some constraints on the
points to which a particle can move, that is not true in full quantum electrodynamics. There is a certain possibility of
an electron or photon at A moving as a basic action to any other place and time in the universe. That includes places
that could only be reached at speeds greater than that of light and also earlier times. (An electron moving backwards
in time can be viewed as a positron moving forward in time.)
Probability amplitudes
[Figure: Addition of probability amplitudes as complex numbers]
Quantum mechanics introduces an important change in the way
probabilities are computed. It has been found that the quantities
which we have to use to represent the probabilities are not the
usual real numbers we use for probabilities in our everyday world,
but complex numbers which are called probability amplitudes.
Feynman avoids exposing the reader to the mathematics of
complex numbers by using a simple but accurate representation of
them as arrows on a piece of paper or screen. (These must not be
confused with the arrows of Feynman diagrams which are actually
simplified representations in two dimensions of a relationship
between points in three dimensions of space and one of time.) The
amplitude-arrows are fundamental to the description of the world given by quantum theory. No satisfactory reason
has been given for why they are needed. But pragmatically we have to accept that they are an essential part of our
description of all quantum phenomena. They are related to our everyday ideas of probability by the simple rule that
the probability of an event is the square of the length of the corresponding amplitude-arrow. So, for a given process,
if two probability amplitudes, v and w, are involved, the probability of the process will be given either by
$$P = |v + w|^2$$
or
$$P = |v \times w|^2.$$
The rules as regards adding or multiplying, however, are the same as above. But where you would expect to add or
multiply probabilities, instead you add or multiply probability amplitudes that now are complex numbers.
Addition and multiplication are familiar operations in the theory of
complex numbers and are given in the figures. The sum is found as
follows. Let the start of the second arrow be at the end of the first.
The sum is then a third arrow that goes directly from the start of
the first to the end of the second. The product of two arrows is an
arrow whose length is the product of the two lengths. The
direction of the product is found by adding the angles that each of
the two have been turned through relative to a reference direction:
that gives the angle that the product is turned relative to the
reference direction.
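The arrow rules are exactly the arithmetic of complex numbers, which Python supports directly. The sketch below uses two made-up arrows (the lengths 0.6 and 0.3 and the angles 30° and 60° are arbitrary choices for illustration) and checks that multiplying arrows multiplies their lengths and adds their angles.

```python
import cmath
import math

# Two made-up amplitude "arrows": (length, angle from the reference direction).
v = cmath.rect(0.6, math.radians(30))
w = cmath.rect(0.3, math.radians(60))

# Placing arrows head to tail is complex addition.
arrow_sum = v + w
# Multiplying arrows: the lengths multiply and the angles add.
arrow_product = v * w

prob_if_alternatives = abs(arrow_sum) ** 2   # P = |v + w|^2
prob_if_sequence = abs(arrow_product) ** 2   # P = |v x w|^2

# For comparison: adding the two probabilities directly ignores the
# relative direction of the arrows (no interference).
naive_sum = abs(v) ** 2 + abs(w) ** 2

print(abs(arrow_product), math.degrees(cmath.phase(arrow_product)))
print(prob_if_alternatives, naive_sum)
```

The difference between `prob_if_alternatives` and `naive_sum` depends on the angle between the two arrows. That dependence is what distinguishes adding amplitudes from adding everyday probabilities.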
That change, from probabilities to probability amplitudes,
complicates the mathematics without changing the basic approach. But that change is still not quite enough because
it fails to take into account the fact that both photons and electrons can be polarized, which is to say that their
orientations in space and time have to be taken into account. Therefore P(A to B) actually consists of 16 complex
numbers, or probability amplitude arrows. There are also some minor changes to do with the quantity "j", which may
have to be rotated by a multiple of 90° for some polarizations, which is only of interest for the detailed bookkeeping.
Associated with the fact that the electron can be polarized is another small necessary detail which is connected with
the fact that an electron is a fermion and obeys Fermi-Dirac statistics. The basic rule is that if we have the
probability amplitude for a given complex process involving more than one electron, then when we include (as we
always must) the complementary Feynman diagram in which we just exchange two electron events, the resulting
amplitude is the reverse (the negative) of the first. The simplest case would be two electrons starting at A and B
ending at C and D. The amplitude would be calculated as the "difference", E(A to D) × E(B to C) − E(A to C) × E(B to
D), where we would expect, from our everyday idea of probabilities, that it would be a sum.
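A minimal sketch of the exchange rule, assuming a made-up one-dimensional amplitude `toy_E` and ignoring polarization entirely: the two ways of pairing start points with end points enter with opposite signs, so swapping the two end points flips the sign of the total amplitude. In this toy the amplitude also vanishes when both electrons would end at the same point.

```python
import cmath

def toy_E(a, b):
    """Made-up complex amplitude for an electron to go from a to b."""
    distance = abs(a - b)
    return cmath.exp(1j * distance) / (1 + distance)

def two_electron_amplitude(A, B, C, D):
    # One pairing of start and end points minus the pairing in which
    # the two end points are exchanged.
    return toy_E(A, D) * toy_E(B, C) - toy_E(A, C) * toy_E(B, D)

amp = two_electron_amplitude(0, 1, 3, 5)
swapped = two_electron_amplitude(0, 1, 5, 3)    # exchange the end points
same_end = two_electron_amplitude(0, 1, 4, 4)   # both end at the same point

print(amp, swapped, same_end)
```

Only the relative minus sign matters for the probability, since the probability is the squared length of the arrow.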
[Figure: Multiplication of probability amplitudes as complex numbers]
Propagators
Finally, one has to compute P(A to B) and E(C to D), corresponding to the probability amplitudes for the photon and
the electron respectively. These are essentially the solutions of the Dirac equation, which describes the behavior of
the electron's probability amplitude, and the Klein-Gordon equation, which describes the behavior of the photon's
probability amplitude. These are called Feynman propagators. The translation to a notation commonly used in the
standard literature is as follows:
$$P(A \text{ to } B) \rightarrow D_F(x_B - x_A), \qquad E(C \text{ to } D) \rightarrow S_F(x_D - x_C)$$
where a shorthand symbol such as $x_A$ stands for the four real numbers which give the time and position in three
dimensions of the point labeled A.
Mass renormalization
A problem arose historically which held up progress for twenty
years: although we start with the assumption of three basic
"simple" actions, the rules of the game say that if we want to
calculate the probability amplitude for an electron to get from A to
B we must take into account all the possible ways: all possible
Feynman diagrams with those end points. Thus there will be a way
in which the electron travels to C, emits a photon there and then
absorbs it again at D before moving on to B. Or it could do this
kind of thing twice, or more. In short we have a fractal-like
situation in which if we look closely at a line it breaks up into a
collection of "simple" lines, each of which, if looked at closely, is
in turn composed of "simple" lines, and so on ad infinitum. This is
a very difficult situation to handle. If adding that detail only
altered things slightly then it would not have been too bad, but
disaster struck when it was found that the simple correction mentioned above led to infinite probability amplitudes.
In time this problem was "fixed" by the technique of renormalization (see below and the article on mass
renormalization). However, Feynman himself remained unhappy about it, calling it a "dippy process".[20]
[Figure: Electron self-energy loop]
Conclusions
Within the above framework physicists were then able to calculate to a high degree of accuracy some of the
properties of electrons, such as the anomalous magnetic dipole moment. However, as Feynman points out, it fails
totally to explain why particles such as the electron have the masses they do. "There is no theory that adequately
explains these numbers. We use the numbers in all our theories, but we don't understand them — what they are, or
where they come from. I believe that from a fundamental point of view, this is a very interesting and serious
problem."[23]
Mathematics
Mathematically, QED is an abelian gauge theory with the symmetry group U(1). The gauge field, which mediates
the interaction between the charged spin-1/2 fields, is the electromagnetic field. The QED Lagrangian for a spin-1/2
field interacting with the electromagnetic field is given by the real part of
$$\mathcal{L} = \bar\psi (i\gamma^\mu D_\mu - m)\psi - \frac{1}{4} F_{\mu\nu} F^{\mu\nu},$$
where
$\gamma^\mu$ are Dirac matrices;
$\psi$ is a bispinor field of spin-1/2 particles (e.g. electron-positron field);
$\bar\psi \equiv \psi^\dagger \gamma_0$, called "psi-bar", is sometimes referred to as the Dirac adjoint;
$D_\mu \equiv \partial_\mu + ieA_\mu + ieB_\mu$ is the gauge covariant derivative;
$e$ is the coupling constant, equal to the electric charge of the bispinor field;
$A_\mu$ is the covariant four-potential of the electromagnetic field generated by the electron itself;
$B_\mu$ is the external field imposed by an external source;
$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ is the electromagnetic field tensor.
Equations of motion
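To begin, substituting the definition of D into the Lagrangian gives us:
$$\mathcal{L} = i\bar\psi \gamma^\mu \partial_\mu \psi - e\bar\psi \gamma_\mu (A^\mu + B^\mu)\psi - m\bar\psi\psi - \frac{1}{4} F_{\mu\nu} F^{\mu\nu}.$$
Next, we can substitute this Lagrangian into the Euler-Lagrange equation of motion for a field:
$$\partial_\mu \left( \frac{\partial \mathcal{L}}{\partial(\partial_\mu \psi)} \right) - \frac{\partial \mathcal{L}}{\partial \psi} = 0 \qquad (2)$$
to find the field equations for QED.
The two terms from this Lagrangian are then:
$$\partial_\mu \left( \frac{\partial \mathcal{L}}{\partial(\partial_\mu \psi)} \right) = \partial_\mu \left( i\bar\psi \gamma^\mu \right)$$
$$\frac{\partial \mathcal{L}}{\partial \psi} = -e\bar\psi \gamma_\mu (A^\mu + B^\mu) - m\bar\psi$$
Substituting these two back into the Euler-Lagrange equation (2) results in:
$$i\partial_\mu \bar\psi \gamma^\mu + e\bar\psi \gamma_\mu (A^\mu + B^\mu) + m\bar\psi = 0$$
with complex conjugate:
$$i\gamma^\mu \partial_\mu \psi - e\gamma_\mu (A^\mu + B^\mu)\psi - m\psi = 0.$$
Bringing the middle term to the right-hand side transforms this second equation into:
$$i\gamma^\mu \partial_\mu \psi - m\psi = e\gamma_\mu (A^\mu + B^\mu)\psi$$
The left-hand side is like the original Dirac equation and the right-hand side is the interaction with the
electromagnetic field.
One further important equation can be found by substituting the Lagrangian into another Euler-Lagrange equation,
this time for the field, $A^\mu$:
$$\partial_\nu \left( \frac{\partial \mathcal{L}}{\partial(\partial_\nu A_\mu)} \right) - \frac{\partial \mathcal{L}}{\partial A_\mu} = 0 \qquad (3)$$
The two terms this time are:
$$\partial_\nu \left( \frac{\partial \mathcal{L}}{\partial(\partial_\nu A_\mu)} \right) = \partial_\nu \left( \partial^\mu A^\nu - \partial^\nu A^\mu \right)$$
$$\frac{\partial \mathcal{L}}{\partial A_\mu} = -e\bar\psi \gamma^\mu \psi$$
and these two terms, when substituted back into (3) give us:
$$\partial_\nu F^{\nu\mu} = e\bar\psi \gamma^\mu \psi$$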
Interaction picture
This theory can be straightforwardly quantized by treating the bosonic and fermionic sectors as free. This makes it
possible to build a set of asymptotic states with which to start the computation of the probability amplitudes for
different processes. To do so, we have to compute an evolution operator $U$ that, for a given initial state
$|i\rangle$, gives a final state $\langle f|$ such that
$$M_{fi} = \langle f|U|i\rangle.$$
This technique is also known as the S-matrix. The evolution operator is obtained in the interaction picture, where time
evolution is given by the interaction Hamiltonian. So, from the equations above, this is
$$V = e \int d^3x\, \bar\psi \gamma^\mu \psi A_\mu$$
and so one has
$$U = T \exp\left[ -\frac{i}{\hbar} \int_{t_0}^{t} dt'\, V(t') \right],$$
where $T$ is the time-ordering operator. This evolution operator has a meaning only as a series, and what we get here
is a perturbation series whose expansion parameter is the fine structure constant. This series is named the Dyson series.
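To make the statement about the series concrete, the time-ordered exponential can be written out term by term. The following is the standard expansion, expressed with the symbols used above; the integration variables $t_1$ and $t_2$ are labels introduced only for this sketch.

```latex
U = 1
  - \frac{i}{\hbar} \int_{t_0}^{t} dt_1 \, V(t_1)
  + \left( -\frac{i}{\hbar} \right)^{2}
    \int_{t_0}^{t} dt_1 \int_{t_0}^{t_1} dt_2 \, V(t_1) V(t_2)
  + \cdots
```

Each factor of $V$ carries one power of the coupling $e$, so the term with $n$ factors of $V$ is of order $e^n$. The nested integration limits keep the later time to the left, which is what the time-ordering operator enforces.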
Feynman diagrams
Despite the conceptual clarity of this Feynman approach to QED, almost no textbooks follow him in their
presentation. When performing calculations it is much easier to work with the Fourier transforms of the propagators.
Quantum physics considers particles' momenta rather than their positions, and it is convenient to think of particles as
being created or annihilated when they interact. Feynman diagrams then look the same, but the lines have different
interpretations. The electron line represents an electron with a given energy and momentum, with a similar
interpretation of the photon line. A vertex diagram represents the annihilation of one electron and the creation of
another together with the absorption or creation of a photon, each having specified energies and momenta.
Using Wick's theorem on the terms of the Dyson series, all the terms of the S-matrix for quantum electrodynamics can
be computed through the technique of Feynman diagrams. In this case the rules for drawing are the following:
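• Fermion propagator (line from $\alpha$ to $\beta$): $\left( \dfrac{i(\gamma^\mu p_\mu + m)}{p^2 - m^2 + i\epsilon} \right)_{\beta\alpha}$
• Photon propagator (Feynman gauge): $\dfrac{-i g_{\mu\nu}}{p^2 + i\epsilon}$
• Vertex: $-ie\gamma^\mu (2\pi)^4 \delta^4(p_1 + p_2 + p_3)$
• Incoming fermion: $u_\alpha(p, s)$
• Incoming antifermion: $\bar{v}_\alpha(p, s)$
• Outgoing fermion: $\bar{u}_\alpha(p, s)$
• Outgoing antifermion: $v_\alpha(p, s)$
• Incoming photon: $\epsilon_\mu(k, \lambda)$
• Outgoing photon: $\epsilon_\mu(k, \lambda)^*$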
To these rules we must add a further one for closed loops that implies an integration on momenta $\int d^4p/(2\pi)^4$.
From them, computations of probability amplitudes are straightforwardly given. An example is Compton scattering,
with an electron and a photon undergoing elastic scattering. Feynman diagrams are in this case
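[Figure: Feynman diagrams for Compton scattering]
and so we are able to get the corresponding amplitude at the first order of a perturbation series for the S-matrix:
$$M_{fi} = (ie)^2\, \bar{u}(\vec{p}\,', s')\, \not{\epsilon}\,'(\vec{k}', \lambda')^*\, \frac{\not{p} + \not{k} + m_e}{(p + k)^2 - m_e^2}\, \not{\epsilon}(\vec{k}, \lambda)\, u(\vec{p}, s) + (ie)^2\, \bar{u}(\vec{p}\,', s')\, \not{\epsilon}(\vec{k}, \lambda)\, \frac{\not{p} - \not{k}' + m_e}{(p - k')^2 - m_e^2}\, \not{\epsilon}\,'(\vec{k}', \lambda')^*\, u(\vec{p}, s)$$
from which we are able to compute the cross section for this scattering.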
Renormalizability
Higher order terms can be straightforwardly computed for the evolution operator, but these terms display diagrams
containing the following simpler ones:
• One-loop contribution to the vacuum polarization function $\Pi$
• One-loop contribution to the electron self-energy function $\Sigma$
• One-loop contribution to the vertex function $\Gamma$
These, being closed loops, imply the presence of diverging integrals having no mathematical meaning. To overcome
this difficulty, the technique of renormalization has been devised, producing finite results in very close agreement
with experiments. An important criterion for a theory to be meaningful after renormalization is that the
number of diverging diagrams is finite. In this case the theory is said to be renormalizable. The reason is that
renormalizing the observables then requires only a finite number of constants, so the predictive value of the theory
is untouched. This is exactly the case of quantum electrodynamics, which displays just three diverging diagrams. This
procedure gives observables in very close agreement with experiment, as seen, for example, for the electron gyromagnetic ratio.
Renormalizability has become an essential criterion for a quantum field theory to be considered viable. All
the theories describing fundamental interactions, except gravitation, whose quantum counterpart is presently under
very active research, are renormalizable theories.
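As a small worked example of the agreement mentioned for the electron gyromagnetic ratio, the leading QED correction to the magnetic moment (the result of Schwinger's 1948 paper, reference [12]) is $a = \alpha/2\pi$. The sketch below evaluates only this first term; the numerical value used for the fine structure constant is an assumed approximate input, and higher-order terms are not included.

```python
import math

# Assumed input: an approximate value of the fine structure constant.
ALPHA = 1 / 137.035999

# Leading-order QED correction to the electron magnetic moment
# (Schwinger's 1948 result): a = alpha / (2 pi).
a_leading = ALPHA / (2 * math.pi)

# The gyromagnetic ratio is g = 2 (1 + a); the Dirac equation alone gives g = 2.
g_leading = 2 * (1 + a_leading)

print(f"a = {a_leading:.7f}, g = {g_leading:.7f}")
```

Even this single term brings g close to the measured value of about 2.00232. The remaining difference is accounted for by higher orders of the perturbation series.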
Nonconvergence of series
An argument by Freeman Dyson shows that the radius of convergence of the perturbation series in QED is zero.[24]
The basic argument goes as follows: if the coupling constant were negative, this would be equivalent to the Coulomb
force constant being negative. This would "reverse" the electromagnetic interaction so that like charges would attract
and unlike charges would repel. This would render the vacuum unstable against decay into a cluster of electrons on
one side of the universe and a cluster of positrons on the other side of the universe. Because the theory is sick for any
negative value of the coupling constant, the series does not converge but is instead an asymptotic series. This can be
taken as a sign that a new theory is needed, as a problem with perturbation theory, or ignored by taking a
"shut-up-and-calculate" approach.
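The behavior of an asymptotic series can be seen in a toy example. The sketch below does not use the QED series. It uses the textbook series $\sum_n (-1)^n n!\, g^n$, which has zero radius of convergence and is asymptotic to the integral $\int_0^\infty e^{-t}/(1 + gt)\, dt$. The function names, the value g = 0.1 and the integration settings are choices made only for this illustration.

```python
import math

def partial_sum(g, order):
    """Partial sum of the toy series sum_n (-1)^n n! g^n up to 'order'."""
    return sum((-1) ** n * math.factorial(n) * g ** n for n in range(order + 1))

def reference_value(g, t_max=60.0, steps=200_000):
    """Trapezoidal estimate of the integral of exp(-t) / (1 + g t) on [0, t_max]."""
    h = t_max / steps

    def f(t):
        return math.exp(-t) / (1 + g * t)

    total = 0.5 * (f(0.0) + f(t_max))
    for k in range(1, steps):
        total += f(k * h)
    return total * h

g = 0.1
exact = reference_value(g)
errors = {order: abs(partial_sum(g, order) - exact) for order in (2, 10, 30)}

for order, err in errors.items():
    print(order, err)
```

The error first shrinks as terms are added and then grows without bound. Truncating near the smallest term gives the best accuracy, which is how a divergent perturbation series can still yield precise predictions.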
References
[1] Feynman, Richard (1985). "Chapter 1". QED: The Strange Theory of Light and Matter. Princeton University Press. p. 6.
ISBN 978-0691125756.
[2] P.A.M. Dirac (1927). "The Quantum Theory of the Emission and Absorption of Radiation". Proceedings of the Royal Society of London A
114: 243-265. doi:10.1098/rspa.1927.0039.
[3] E. Fermi (1932). "Quantum Theory of Radiation". Reviews of Modern Physics 4: 87-132. doi:10.1103/RevModPhys.4.87.
[4] F. Bloch; A. Nordsieck (1937). "Note on the Radiation Field of the Electron". Physical Review 52: 54-59. doi:10.1103/PhysRev.52.54.
[5] V. F. Weisskopf (1939). "On the Self-Energy and the Electromagnetic Field of the Electron". Physical Review 56: 72-85.
doi:10.1103/PhysRev.56.72.
[6] R. Oppenheimer (1930). "Note on the Theory of the Interaction of Field and Matter". Physical Review 35: 461-477.
doi:10.1103/PhysRev.35.461.
[7] W. E. Lamb; R. C. Retherford (1947). "Fine Structure of the Hydrogen Atom by a Microwave Method". Physical Review 72: 241-243.
doi:10.1103/PhysRev.72.241.
[8] P. Kusch; H. M. Foley (1948). "On the Intrinsic Moment of the Electron". Physical Review 73: 412. doi:10.1103/PhysRev.74.250.
[9] Schweber, Silvan (1994). "Chapter 5". QED and the Men Who Made It: Dyson, Feynman, Schwinger, and Tomonaga. Princeton University
Press. p. 230. ISBN 978-0691033273.
[10] H. Bethe (1947). "The Electromagnetic Shift of Energy Levels". Physical Review 72: 339-341. doi:10.1103/PhysRev.72.339.
[11] S. Tomonaga (1946). "On a Relativistically Invariant Formulation of the Quantum Theory of Wave Fields". Progress of Theoretical Physics
1: 27-42. doi:10.1143/PTP.1.27.
[12] J. Schwinger (1948). "On Quantum-Electrodynamics and the Magnetic Moment of the Electron". Physical Review 73: 416-417.
doi:10.1103/PhysRev.73.416.
[13] J. Schwinger (1948). "Quantum Electrodynamics. I. A Covariant Formulation". Physical Review 74: 1439-1461.
doi:10.1103/PhysRev.74.1439.
[14] R. P. Feynman (1949). "Space-Time Approach to Quantum Electrodynamics". Physical Review 76: 769-789. doi:10.1103/PhysRev.76.769.
[15] R. P. Feynman (1949). "The Theory of Positrons". Physical Review 76: 749-759. doi:10.1103/PhysRev.76.749.
[16] R. P. Feynman (1950). "Mathematical Formulation of the Quantum Theory of Electromagnetic Interaction". Physical Review 80: 440-457.
doi:10.1103/PhysRev.80.440.
[17] F. Dyson (1949). "The Radiation Theories of Tomonaga, Schwinger, and Feynman". Physical Review 75: 486-502.
doi:10.1103/PhysRev.75.486.
[18] F. Dyson (1949). "The S Matrix in Quantum Electrodynamics". Physical Review 75: 1736-1755. doi:10.1103/PhysRev.75.1736.
[19] "The Nobel Prize in Physics 1965" (http://nobelprize.org/nobel_prizes/physics/laureates/1965/index.html). Nobel Foundation.
Retrieved 2008-10-09.
[20] Feynman, Richard (1985). QED: The Strange Theory of Light and Matter. Princeton University Press. p. 128. ISBN 978-0691125756.
[21] G.S. Guralnik, C.R. Hagen, T.W.B. Kibble (1964). "Global Conservation Laws and Massless Particles". Physical Review Letters 13:
585-587. doi:10.1103/PhysRevLett.13.585.
[22] G.S. Guralnik (2009). "The History of the Guralnik, Hagen and Kibble development of the Theory of Spontaneous Symmetry Breaking and
Gauge Particles". International Journal of Modern Physics A 24: 2601-2627. doi:10.1142/S0217751X09045431. arXiv:0907.3466.
[23] Feynman, Richard (1985). QED: The Strange Theory of Light and Matter. Princeton University Press. p. 152. ISBN 978-0691125756.
[24] Kinoshita, Toichiro. "Quantum Electrodynamics has Zero Radius of Convergence", summarized from Toichiro Kinoshita
(http://www.lassp.cornell.edu/sethna/Cracks/QED.html). Retrieved 06-10-2010.
Further reading
Books
• De Broglie, Louis (1925). Recherches sur la theorie des quanta [Research on quantum theory]. France:
Wiley-Interscience.
• Feynman, Richard Phillips (1998). Quantum Electrodynamics. Westview Press; New Ed edition.
ISBN 978-0201360752.
• Jauch, J.M.; Rohrlich, F. (1980). The Theory of Photons and Electrons. Springer-Verlag. ISBN 978-0387072951.
• Greiner, Walter; Bromley, D.A.; Müller, Berndt (2000). Gauge Theory of Weak Interactions. Springer.
ISBN 978-3540676720.
• Kane, Gordon, L. (1993). Modern Elementary Particle Physics. Westview Press. ISBN 978-0201624601.
• Miller, Arthur I. (1995). Early Quantum Electrodynamics : A Sourcebook. Cambridge University Press.
ISBN 978-0521568913.
• Milonni, Peter W., (1994) The quantum vacuum - an introduction to quantum electrodynamics. Academic Press.
ISBN 0-12-498080-5
• Schweber, Silvan S. (1994). QED and the Men Who Made It. Princeton University Press.
ISBN 978-0691033273.
• Schwinger, Julian (1958). Selected Papers on Quantum Electrodynamics. Dover Publications.
ISBN 978-0486604442.
• Tannoudji-Cohen, Claude; Dupont-Roc, Jacques, and Grynberg, Gilbert (1997). Photons and Atoms: Introduction
to Quantum Electrodynamics. Wiley-Interscience. ISBN 978-0471184331.
Journals
• Dudley, J.M., and Kwan, A.M. (1996) "Richard Feynman's popular lectures on quantum electrodynamics: The
1979 Robb Lectures at Auckland University," American Journal of Physics 64: 694-698.
External links
• Feynman's Nobel Prize lecture describing the evolution of QED and his role in it (http://nobelprize.org/physics/
laureates/1965/feynman-lecture.html)
• Feynman's New Zealand lectures on QED for non-physicists (http://www.vega.org.uk/video/subseries/8)
Quantum field theory
Quantum field theory (QFT) provides a theoretical framework for constructing quantum mechanical models of
systems classically parametrized (represented) by an infinite number of dynamical degrees of freedom, that is, fields
and (in a condensed matter context) many-body systems. It is the natural and quantitative language of particle
physics and condensed matter physics. Most theories in modern particle physics, including the Standard Model of
elementary particles and their interactions, are formulated as relativistic quantum field theories. Quantum field
theories are used in many contexts. Elementary particle physics is the most vital example: there, the number of
particles going into a reaction can differ from the number coming out. Quantum field theories are also used for
the description of critical phenomena and quantum phase transitions, such as in the BCS theory of
superconductivity (see also phase transition, quantum phase transition, critical phenomena). Quantum field theory is
thought by many to be the unique and correct outcome of combining the rules of quantum mechanics with special
relativity.
In perturbative quantum field theory, the forces between particles are mediated by other particles. The
electromagnetic force between two electrons is caused by an exchange of photons. Intermediate vector bosons
mediate the weak force and gluons mediate the strong force. There is currently no complete quantum theory of the
remaining fundamental force, gravity, but many of the proposed theories postulate the existence of a graviton
particle that mediates it. These force-carrying particles are virtual particles and, by definition, cannot be detected
while carrying the force, because such detection will imply that the force is not being carried. In addition, the notion
of "force mediating particle" comes from perturbation theory, and thus does not make sense in a context of bound
states.
In QFT, photons are not thought of as 'little billiard balls'; they are considered to be field quanta, that is, necessarily
chunked ripples in a field, or "excitations," that 'look like' particles. Fermions, like the electron, can also be described
as ripples/excitations in a field, where each kind of fermion has its own field. In summary, the classical visualisation
of "everything is particles and fields," in quantum field theory, resolves into "everything is particles," which then
resolves into "everything is fields." In the end, particles are regarded as excited states of a field (field quanta). The
gravitational field and the electromagnetic field are the only two fundamental fields in Nature that have infinite range
and a corresponding classical low-energy limit, which greatly diminishes and hides their "particle-like" excitations.
Albert Einstein, in 1905, attributed "particle-like" and discrete exchanges of momenta and energy, characteristic of
"field quanta," to the electromagnetic field. Originally, his principal motivation was to explain the thermodynamics
of radiation. Although it is often claimed that the photoelectric and Compton effects require a quantum description of
the EM field, this is now understood to be untrue; proper proof of the quantum nature of radiation is now provided
by modern quantum optics, as in the antibunching effect. The word "photon" was coined in 1926 by the great
physical chemist Gilbert Newton Lewis (see also the articles photon antibunching and laser).
The "low-energy limit" of the correct quantum field-theoretic description of the electromagnetic field, quantum
electrodynamics, is believed to become James Clerk Maxwell's 1864 theory, although the "classical limit" of
quantum electrodynamics has not been as widely explored as that of quantum mechanics. Presumably, the as yet
unknown correct quantum field-theoretic treatment of the gravitational field will become and "look exactly like"
Einstein's general theory of relativity in the "low-energy limit." Indeed, quantum field theory itself is quite possibly
the low-energy-effective-field-theory limit of a more fundamental theory such as superstring theory. Compare in this
context the article effective field theory.
History
Quantum field theory originated in the 1920s from the problem of creating a quantum mechanical theory of the
electromagnetic field. In 1925 Werner Heisenberg, Max Born, and Pascual Jordan constructed such a theory by
expressing the field's internal degrees of freedom as an infinite set of harmonic oscillators and by employing the
usual procedure for quantizing those oscillators. This theory assumed that no electric charges or currents were
present, and today would be called a free field theory. The first reasonably complete theory of quantum
electrodynamics, which included both the electromagnetic field and electrically charged matter (specifically,
electrons) as quantum mechanical objects, was created by Paul Dirac in 1927. This quantum field theory could be
used to model important processes such as the emission of a photon by an electron dropping into a quantum state of
lower energy, a process in which the number of particles changes — one atom in the initial state becomes an atom
plus a photon in the final state. It is now understood that the ability to describe such processes is one of the most
important features of quantum field theory.
It was evident from the beginning that a proper quantum treatment of the electromagnetic field had to somehow
incorporate Einstein's relativity theory, which had after all grown out of the study of classical electromagnetism. This
need to put together relativity and quantum mechanics was the second major motivation in the development of
quantum field theory. Pascual Jordan and Wolfgang Pauli showed in 1928 that quantum fields could be made to
behave in the way predicted by special relativity during coordinate transformations (specifically, they showed that
the field commutators were Lorentz invariant). A further boost for quantum field theory came with the discovery of
the Dirac equation, which was originally formulated and interpreted as a single-particle equation analogous to the
Schrödinger equation. Unlike the Schrödinger equation, the Dirac equation satisfies both Lorentz invariance, that
is, the requirements of special relativity, and the rules of quantum mechanics. The Dirac equation accommodated the
spin-1/2 value of the electron and accounted for its magnetic moment as well as giving accurate predictions for the
spectra of hydrogen. The attempted interpretation of the Dirac equation as a single-particle equation could not be
maintained long, however, and finally it was shown that several of its undesirable properties (such as
negative-energy states) could be made sense of by reformulating and reinterpreting the Dirac equation as a true field
equation, in this case for the quantized "Dirac field" or the "electron field", with the "negative-energy solutions"
pointing to the existence of anti-particles. This work was performed first by Dirac himself with the invention of hole
theory in 1930 and also by Wendell Furry, Robert Oppenheimer, Vladimir Fock, and others. Schrödinger, during the
same period that he discovered his famous equation in 1926, also independently found the relativistic generalization
of it known as the Klein-Gordon equation but dismissed it since, without spin, it predicted impossible properties for
the hydrogen spectrum (see Oskar Klein and Walter Gordon). All relativistic wave equations that describe spin-zero
particles are said to be of the Klein-Gordon type.
A subtle and careful analysis in 1933 and later in 1950 by Niels Bohr and Leon Rosenfeld showed that there is a
fundamental limitation on the ability to simultaneously measure the electric and magnetic field strengths that enter
into the description of charges in interaction with radiation, imposed by the uncertainty principle, which must apply
to all canonically conjugate quantities. This limitation is crucial for the successful formulation and interpretation of a
quantum field theory of photons and electrons (quantum electrodynamics), and indeed, any perturbative quantum field
theory. The analysis of Bohr and Rosenfeld explains fluctuations in the values of the electromagnetic field that differ
from the classically "allowed" values distant from the sources of the field. Their analysis was crucial to showing that
the limitations and physical implications of the uncertainty principle apply to all dynamical systems, whether fields
or material particles. Their analysis also convinced most people that any notion of returning to a fundamental
description of nature based on classical field theory, such as what Einstein aimed at with his numerous and failed
attempts at a classical unified field theory, was simply out of the question.
The third thread in the development of quantum field theory was the need to handle the statistics of many-particle
systems consistently and with ease. In 1927 Jordan tried to extend the canonical quantization of fields to the
many-body wave functions of identical particles, a procedure that is sometimes called second quantization. In 1928,
Jordan and Eugene Wigner found that the quantum field describing electrons, or other fermions, had to be expanded
using anti-commuting creation and annihilation operators due to the Pauli exclusion principle. This thread of
development was incorporated into many-body theory and strongly influenced condensed matter physics and nuclear
physics.
Despite its early successes, quantum field theory was plagued by several serious theoretical difficulties. Basic
physical quantities, such as the self-energy of the electron, the energy shift of electron states due to the presence of
the electromagnetic field, gave infinite, divergent contributions — a nonsensical result — when computed using the
perturbative techniques available in the 1930s and most of the 1940s. The electron self-energy problem was already
a serious issue in the classical electromagnetic field theory, where the attempt to attribute to the electron a finite size
or extent (the classical electron-radius) led immediately to the question of what non-electromagnetic stresses would
need to be invoked, which would presumably hold the electron together against the Coulomb repulsion of its
finite-sized "parts". The situation was dire, and had certain features that reminded many of the "Rayleigh-Jeans
difficulty". What made the situation in the 1940s so desperate and gloomy, however, was the fact that the correct
ingredients (the second-quantized Maxwell-Dirac field equations) for the theoretical description of interacting
photons and electrons were well in place, and no major conceptual change was needed analogous to that which was
necessitated by a finite and physically sensible account of the radiative behavior of hot objects, as provided by the
Planck radiation law.
This "divergence problem" was solved in the case of quantum electrodynamics during the late 1940s and early 1950s
by Hans Bethe, Tomonaga, Schwinger, Feynman, and Dyson, through the procedure known as renormalization.
Great progress was made after realizing that all infinities in quantum electrodynamics are related to two effects:
the self-energy of the electron/positron and vacuum polarization. Renormalization concerns the business of paying
very careful attention to just what is meant by, for example, the very concepts "charge" and "mass" as they occur in
the pure, non-interacting field-equations. The "vacuum" is itself polarizable and, hence, populated by virtual particle
(on shell and off shell) pairs, and, hence, is a seething and busy dynamical system in its own right. This was a critical
step in identifying the source of "infinities" and "divergences". The "bare mass" and the "bare charge" of a particle,
the values that appear in the free-field equations (non-interacting case), are abstractions that are simply not realized
in experiment (in interaction). What we measure, and hence, what we must take account of with our equations, and
what the solutions must account for, are the "renormalized mass" and the "renormalized charge" of a particle. That is
to say, the "shifted" or "dressed" values these quantities must have, when due care is taken to include all deviations
from their "bare values", are dictated by the very nature of quantum fields themselves.
The first approach that bore fruit is known as the "interaction representation," (see the article Interaction picture) a
Lorentz covariant and gauge-invariant generalization of time-dependent perturbation theory used in ordinary
quantum mechanics, and developed by Tomonaga and Schwinger, generalizing earlier efforts of Dirac, Fock and
Podolsky. Tomonaga and Schwinger invented a relativistically covariant scheme for representing field commutators
and field operators intermediate between the two main representations of a quantum system, the Schrödinger and the
Heisenberg representations (see the article on quantum mechanics). Within this scheme, field commutators at
separated points can be evaluated in terms of "bare" field creation and annihilation operators. This allows for keeping
track of the time-evolution of both the "bare" and "renormalized", or perturbed, values of the Hamiltonian and
expresses everything in terms of the coupled, gauge invariant "bare" field-equations. Schwinger gave the most
elegant formulation of this approach. The next and most famous development is due to Feynman, with his
brilliant rules for assigning a "graph"/"diagram" to the terms in the scattering matrix (see S-matrix and Feynman
diagrams). These directly corresponded (through the Schwinger-Dyson equation) to the measurable physical
processes (cross sections, probability amplitudes, decay widths and lifetimes of excited states) one needs to be able
to calculate. This revolutionized how quantum field theory calculations are carried out in practice.
Two classic text-books from the 1960s, J.D. Bjorken and S.D. Drell, Relativistic Quantum Mechanics (1964) and J.J.
Sakurai, Advanced Quantum Mechanics (1967), thoroughly developed the Feynman graph expansion techniques
using physically intuitive and practical methods following from the correspondence principle, without worrying
about the technicalities involved in deriving the Feynman rules from the superstructure of quantum field theory
itself. Although both Feynman's heuristic and pictorial style of dealing with the infinities and the formal
methods of Tomonaga and Schwinger worked extremely well, and gave spectacularly accurate answers, the true
analytical nature of the question of "renormalizability", that is, whether any theory formulated as a "quantum field
theory" would give finite answers, was not worked out until much later, when the urgency of trying to formulate finite
theories for the strong and electro-weak (and gravitational) interactions demanded its solution.
Renormalization in the case of QED was largely fortuitous. The coupling constant, the so-called fine structure
constant, is small and has no dimensions involving mass, and the gauge boson involved, the photon, has zero mass;
together these features rendered the small-distance/high-energy behavior of QED manageable. Also,
electromagnetic processes are very "clean" in the sense that they are not badly suppressed/damped and/or hidden by
the other gauge interactions. By 1958 Sidney Drell observed: "Quantum electrodynamics (QED) has achieved a
status of peaceful coexistence with its divergences...."
The unification of the electromagnetic force with the weak force encountered initial difficulties due to the lack of
accelerator energies high enough to reveal processes beyond the Fermi interaction range. Additionally, a satisfactory
theoretical understanding of hadron substructure had to be developed, culminating in the quark model.
In the case of the strong interactions, progress concerning their short-distance/high-energy behavior was much
slower and more frustrating. For the strong interactions, as with the electro-weak fields, there were difficult issues regarding
the strength of coupling, the mass generation of the force carriers, and their non-linear self-interactions.
Although there has been theoretical progress toward a grand unified quantum field theory incorporating the
electro-magnetic force, the weak force and the strong force, empirical verification is still pending. Superunification,
incorporating the gravitational force, is still very speculative, and is under intensive investigation by many of the best
minds in contemporary theoretical physics. Gravitation is a tensor field description of a spin-2 gauge-boson, the
"graviton", and is further discussed in the articles on general relativity and quantum gravity.
From the point of view of the techniques of (four-dimensional) quantum field theory, and as the numerous and heroic
efforts to formulate a consistent quantum gravity theory by some very able minds attest, gravitational quantization
was, and is still, the reigning champion for bad behavior. There are problems and frustrations stemming from the fact
that the gravitational coupling constant has dimensions involving inverse powers of mass, and as a simple
consequence, it is plagued by badly behaved (in the sense of perturbation theory) non-linear and violent
self-interactions. Gravity, basically, gravitates, which in turn gravitates, and so on (i.e., gravity is itself a source of
gravity), thus creating a nightmare at all orders of perturbation theory. Also, gravity couples to all energy equally
strongly, as per the equivalence principle, so this makes the notion of ever really "switching-off", "cutting-off" or
separating the gravitational interaction from other interactions ambiguous and impossible since, with gravitation, we
are dealing with the very structure of space-time itself. (See general covariance and, for a modest, yet highly
non-trivial and significant interplay between QFT and gravitation (spacetime), see the article Hawking radiation
and references cited therein. See also quantum field theory in curved spacetime.)
Thanks to the somewhat brute-force, clanky and heuristic methods of Feynman, and the elegant and abstract methods
of Tomonaga/Schwinger, from the period of early renormalization, we do have the modern theory of quantum
electrodynamics (QED). It is still the most accurate physical theory known, the prototype of a successful quantum
field theory. Beginning in the 1950s with the work of Yang and Mills, as well as Ryoyu Utiyama, following the
previous lead of Weyl and Pauli, deep explorations illuminated the types of symmetries and invariances any field
theory must satisfy. QED, and indeed, all field theories, were generalized to a class of quantum field theories known
as gauge theories. Quantum electrodynamics is the most famous example of what is known as an Abelian gauge
theory. It relies on the symmetry group U(1) and has one massless gauge field; the U(1) gauge symmetry dictates
the form of the interactions involving the electromagnetic field, with the photon being the gauge boson. That
symmetries dictate, limit and necessitate the form of interaction between particles is the essence of the "gauge theory
revolution." Yang and Mills formulated the first explicit example of a non-Abelian gauge theory, Yang-Mills theory,
with an attempted explanation of the strong interactions in mind. In the mid-1950s the strong interactions were
(incorrectly) understood to be mediated by the pi-mesons, the particles predicted by Hideki Yukawa in 1935,
based on his profound reflections concerning the reciprocal connection between the mass of any force-mediating
particle and the range of the force it mediates. This was allowed by the uncertainty principle. The 1960s and 1970s
saw the formulation of a gauge theory now known as the Standard Model of particle physics, which systematically
describes the elementary particles and the interactions between them.
The electroweak interaction part of the standard model was formulated by Sheldon Glashow in the years 1958-60
with his discovery of the SU(2)×U(1) group structure of the theory. Steven Weinberg and Abdus Salam brilliantly
invoked the Anderson-Higgs mechanism for the generation of the W and Z masses (the intermediate vector
bosons responsible for the weak interactions and neutral currents) while keeping the mass of the photon zero. The
Goldstone/Higgs idea for generating mass in gauge theories was sparked in the late 1950s and early 1960s when a
number of theoreticians (including Yoichiro Nambu, Steven Weinberg, Jeffrey Goldstone, François Englert, Robert
Brout, G. S. Guralnik, C. R. Hagen, Tom Kibble and Philip Warren Anderson) noticed a possibly useful analogy to
the (spontaneous) breaking of the U(1) symmetry of electromagnetism in the formation of the BCS ground state of a
superconductor. The gauge boson involved in this situation, the photon, behaves as though it has acquired a finite
mass. There is a further possibility that the physical vacuum (ground state) does not respect the symmetries implied
by the "unbroken" electroweak Lagrangian (see the article Electroweak interaction for more details) from which one
arrives at the field equations. The electroweak theory of Weinberg and Salam was shown to be renormalizable
(finite) and hence consistent by Gerardus 't Hooft and Martinus Veltman. The Glashow-Weinberg-Salam theory
(GWS-Theory) is a triumph and, in certain applications, gives an accuracy on a par with quantum electrodynamics.
Also during the 1970s, parallel developments in the study of phase transitions in condensed matter physics led Leo
Kadanoff, Michael Fisher and Kenneth Wilson (extending work of Ernst Stueckelberg, André Petermann, Murray
Gell-Mann, and Francis Low) to a set of ideas and methods known as the renormalization group. By providing a
better physical understanding of the renormalization procedure invented in the 1940s, the renormalization group
sparked what has been called the "grand synthesis" of theoretical physics, uniting the quantum field theoretical
techniques used in particle physics and condensed matter physics into a single theoretical framework.
The study of quantum field theory is alive and flourishing, as are applications of this method to many physical
problems. It remains one of the most vital areas of theoretical physics today, providing a common language to many
branches of physics.
Principles of quantum field theory
Classical fields and quantum fields
Quantum mechanics, in its most general formulation, is a theory of abstract operators (observables) acting on an
abstract state space (Hilbert space), where the observables represent physically observable quantities and the state
space represents the possible states of the system under study. Furthermore, each observable corresponds, in a
technical sense, to the classical idea of a degree of freedom. For instance, the fundamental observables associated
with the motion of a single quantum mechanical particle are the position and momentum operators x and p.
Ordinary quantum mechanics deals with systems such as this, which possess a small set of degrees of freedom.
(It is important to note, at this point, that this article does not use the word "particle" in the context of wave-particle
duality. In quantum field theory, "particle" is a generic term for any discrete quantum mechanical entity, such as an
electron or photon, which can behave like classical particles or classical waves under different experimental
conditions.)
A quantum field is a quantum mechanical system containing a large, and possibly infinite, number of degrees of
freedom. A classical field contains a set of degrees of freedom at each point of space; for instance, the classical
electromagnetic field defines two vectors — the electric field and the magnetic field — that can in principle take on
distinct values for each position r. When the field as a whole is considered as a quantum mechanical system, its
observables form an infinite (in fact uncountable) set, because r is continuous.
Furthermore, the degrees of freedom in a quantum field are arranged in "repeated" sets. For example, the degrees of
freedom in an electromagnetic field can be grouped according to the position r, with exactly two vectors for each
r. Note that r is an ordinary number that "indexes" the observables; it is not to be confused with the position
operator x encountered in ordinary quantum mechanics, which is an observable. (Thus, ordinary quantum
mechanics is sometimes referred to as "zero-dimensional quantum field theory", because it contains only a single set
of observables.)
It is also important to note that there is nothing special about r because, as it turns out, there is generally more than
one way of indexing the degrees of freedom in the field.
In the following sections, we will show how these ideas can be used to construct a quantum mechanical theory with
the desired properties. We will begin by discussing single-particle quantum mechanics and the associated theory of
many-particle quantum mechanics. Then, by finding a way to index the degrees of freedom in the many-particle
problem, we will construct a quantum field and study its implications.
Single-particle and many-particle quantum mechanics
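In quantum mechanics, the time-dependent Schrödinger equation for a single particle is
\[ \left[ \frac{|\mathbf{p}|^2}{2m} + V(\mathbf{x}) \right] |\psi(t)\rangle = i\hbar \frac{\partial}{\partial t} |\psi(t)\rangle, \]
where m is the particle's mass, V is the applied potential, and $|\psi(t)\rangle$ denotes the quantum state in bra-ket notation.
We wish to consider how this problem generalizes to N particles. There are two motivations for studying the
many-particle problem. The first is a straightforward need in condensed matter physics, where typically the number
of particles is on the order of Avogadro's number (6.0221415 × 10^23). The second motivation for the many-particle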
problem arises from particle physics and the desire to incorporate the effects of special relativity. If one attempts to
include the relativistic rest energy into the above equation (in quantum mechanics where position is an observable),
the result is either the Klein-Gordon equation or the Dirac equation. However, these equations have many
unsatisfactory qualities; for instance, they possess energy eigenvalues that extend to −∞, so that there seems to be no
easy definition of a ground state. It turns out that such inconsistencies arise from relativistic wavefunctions having a
probabilistic interpretation in position space, as probability conservation is not a relativistically covariant concept. In
quantum field theory, unlike in quantum mechanics, position is not an observable, and thus, one does not need the
concept of a position-space probability density. For quantum fields whose interaction can be treated perturbatively,
retaining a fixed-particle-number wavefunction description is equivalent to neglecting the possibility of dynamically
creating or destroying particles, which is a crucial
aspect of relativistic quantum theory. Einstein's famous mass-energy relation allows for the possibility that
sufficiently massive particles can decay into several lighter particles, and sufficiently energetic particles can combine
to form massive particles. For example, an electron and a positron can annihilate each other to create photons. This
suggests that a consistent relativistic quantum theory should be able to describe many-particle dynamics.
Furthermore, we will assume that the N particles are indistinguishable. As described in the article on identical
particles, this implies that the state of the entire system must be either symmetric (bosons) or antisymmetric
(fermions) when the coordinates of its constituent particles are exchanged. These multi-particle states are rather
complicated to write. For example, the general quantum state of a system of N bosons is written as
\[ |\phi_1 \cdots \phi_N\rangle = \frac{1}{\sqrt{N! \prod_j N_j!}} \sum_{p \in S_N} |\phi_{p(1)}\rangle \cdots |\phi_{p(N)}\rangle, \]
where $|\phi_i\rangle$ are the single-particle states, $N_j$ is the number of particles occupying state j, and the sum is taken
over all possible permutations p acting on N elements. In general, this is a sum of N! (N factorial)
terms, which quickly becomes unmanageable as N increases. The way to simplify this problem is to turn it into a
quantum field theory.
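The growth in the number of terms can be seen directly for small N. The following is a minimal Python sketch rather than anything taken from the article: the function name symmetrized_state is hypothetical, single-particle states are represented by integer labels, and the normalization 1/sqrt(N! · Π_j N_j!) for a sum over all N! permutations is an assumption chosen so that the resulting state has unit norm.

```python
from collections import Counter
from itertools import permutations
from math import factorial, prod, sqrt


def symmetrized_state(labels):
    """Return {ordered tuple of labels: amplitude} for the symmetric state.

    `labels` lists the single-particle state occupied by each particle,
    e.g. (1, 2, 2) for one particle in state 1 and two in state 2.
    Assumed normalization: 1 / sqrt(N! * prod_j N_j!).
    """
    n = len(labels)
    occupations = Counter(labels)
    norm = 1.0 / sqrt(factorial(n) * prod(factorial(c) for c in occupations.values()))
    state = Counter()
    for p in permutations(labels):  # all N! orderings, repeats included
        state[p] += norm
    return dict(state)


state = symmetrized_state((1, 2, 2))
for ordering, amplitude in sorted(state.items()):
    print(ordering, amplitude)
print("norm squared:", sum(a * a for a in state.values()))
```

For the labels (1, 2, 2) the six permutations collapse onto three distinct orderings, each with amplitude close to 1/sqrt(3); with N distinct labels all N! orderings survive, which is the unmanageable growth the text refers to.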
Second quantization
In this section, we will describe a method for constructing a quantum field theory called second quantization. This
basically involves choosing a way to index the quantum mechanical degrees of freedom in the space of multiple
identical-particle states. It is based on the Hamiltonian formulation of quantum mechanics; several other approaches
exist, such as the Feynman path integral, which uses a Lagrangian formulation. For an overview, see the article on
quantization.
Second quantization of bosons
For simplicity, we will first discuss second quantization for bosons, which form perfectly symmetric quantum states.
Let us denote the mutually orthogonal single-particle states by $|\phi_1\rangle$, $|\phi_2\rangle$, $|\phi_3\rangle$, and so on. For example, the
3-particle state with one particle in state $|\phi_1\rangle$ and two in state $|\phi_2\rangle$ is
\[ \frac{1}{\sqrt{3}} \left[ |\phi_1\rangle|\phi_2\rangle|\phi_2\rangle + |\phi_2\rangle|\phi_1\rangle|\phi_2\rangle + |\phi_2\rangle|\phi_2\rangle|\phi_1\rangle \right]. \]
The first step in second quantization is to express such quantum states in terms of occupation numbers, by listing
the number of particles occupying each of the single-particle states $|\phi_1\rangle$, $|\phi_2\rangle$, etc. This is simply another way of
labelling the states. For instance, the above 3-particle state is denoted as
\[ |1, 2, 0, 0, 0, \ldots\rangle. \]
The next step is to expand the N-particle state space to include the state spaces for all possible values of N. This
extended state space, known as a Fock space, is composed of the state space of a system with no particles (the
so-called vacuum state), plus the state space of a 1-particle system, plus the state space of a 2-particle system, and so
forth. It is easy to see that there is a one-to-one correspondence between the occupation number representation and
valid boson states in the Fock space.
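The relabelling by occupation numbers can be illustrated with a short Python sketch. The function name occupation_numbers and the choice to truncate the list at n_modes single-particle states are assumptions made for illustration only; the Fock space itself has no such cutoff.

```python
from collections import Counter


def occupation_numbers(labels, n_modes):
    """Map a list of occupied single-particle labels (1-based) to (N_1, ..., N_n_modes)."""
    counts = Counter(labels)
    return tuple(counts.get(j, 0) for j in range(1, n_modes + 1))


# One particle in state 1 and two in state 2, in any particle ordering,
# gives the same occupation-number label.
print(occupation_numbers((1, 2, 2), 5))  # (1, 2, 0, 0, 0)
print(occupation_numbers((2, 1, 2), 5))  # (1, 2, 0, 0, 0)
```

Because the symmetric state does not depend on the order in which the particles are listed, every ordering maps to the same tuple, which is why the occupation numbers label boson states uniquely.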
At this point, the quantum mechanical system has become a quantum field in the sense we described above. The
field's elementary degrees of freedom are the occupation numbers, and each occupation number is indexed by a
number j, indicating which of the single-particle states $|\phi_1\rangle, |\phi_2\rangle, \ldots, |\phi_j\rangle, \ldots$ it refers to.
The properties of this quantum field can be explored by defining creation and annihilation operators, which add and
subtract particles. They are analogous to "ladder operators" in the quantum harmonic oscillator problem, which
add and subtract energy quanta. However, these operators literally create and annihilate particles of a given
quantum state. The bosonic annihilation operator $a_2$ and creation operator $a_2^\dagger$ have the following effects:
\[ a_2 |N_1, N_2, N_3, \ldots\rangle = \sqrt{N_2}\, |N_1, (N_2 - 1), N_3, \ldots\rangle, \]
\[ a_2^\dagger |N_1, N_2, N_3, \ldots\rangle = \sqrt{N_2 + 1}\, |N_1, (N_2 + 1), N_3, \ldots\rangle. \]
It can be shown that these are operators in the usual quantum mechanical sense, i.e. linear operators acting on the
Fock space. Furthermore, they are indeed Hermitian conjugates, which justifies the way we have written them. They
can be shown to obey the commutation relation
\[ [a_i, a_j] = 0, \quad [a_i^\dagger, a_j^\dagger] = 0, \quad [a_i, a_j^\dagger] = \delta_{ij}, \]
where $\delta_{ij}$ stands for the Kronecker delta. These are precisely the relations obeyed by the ladder operators for an
infinite set of independent quantum harmonic oscillators, one for each single-particle state. Adding or removing
bosons from each state is therefore analogous to exciting or de-exciting a quantum of energy in a harmonic
oscillator.
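The action of the ladder operators and their commutation relation can be checked numerically for a single mode. The following is a minimal sketch, assuming a Fock space truncated at a hypothetical maximum occupation number (`dim` is an illustrative cutoff, not part of the theory); the truncation spoils the commutator only in the last basis state.

```python
import numpy as np

def annihilation(dim):
    """Matrix of the bosonic annihilation operator on the states |0>, ..., |dim-1>."""
    # a|n> = sqrt(n)|n-1>, so sqrt(1), ..., sqrt(dim-1) sit on the first superdiagonal.
    return np.diag(np.sqrt(np.arange(1, dim)), k=1)

dim = 6                      # hypothetical cutoff: occupation numbers 0..5
a = annihilation(dim)
a_dag = a.conj().T           # creation operator as the Hermitian conjugate

# Acting on |N=2>: a lowers it to sqrt(2)|1>, a_dag raises it to sqrt(3)|3>.
ket2 = np.zeros(dim)
ket2[2] = 1.0
lowered = a @ ket2
raised = a_dag @ ket2

# [a, a_dag] equals the identity except in the last row and column,
# which is an artifact of truncating the infinite-dimensional space.
commutator = a @ a_dag - a_dag @ a
number_op = a_dag @ a        # diagonal, with eigenvalues 0, 1, ..., dim-1
```

The diagonal of `number_op` reproduces the occupation numbers, which is the single-mode version of the statement that each single-particle state behaves like an independent harmonic oscillator.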
The Hamiltonian of the quantum field (which, through the Schrödinger equation, determines its dynamics) can be
written in terms of creation and annihilation operators. For instance, the Hamiltonian of a field of free
(non-interacting) bosons is
$$H = \sum_k E_k \, a_k^\dagger a_k,$$
where $E_k$ is the energy of the $k$-th single-particle energy eigenstate. Note that
$$a_k^\dagger a_k \, |\ldots, N_k, \ldots\rangle = N_k \, |\ldots, N_k, \ldots\rangle.$$
Second quantization of fermions
It turns out that a different definition of creation and annihilation must be used for describing fermions. According to
the Pauli exclusion principle, fermions cannot share quantum states, so their occupation numbers $N_i$ can only take
on the value 0 or 1. The fermionic annihilation operators $c$ and creation operators $c^\dagger$ are defined by their actions on
a Fock state as follows:
$$c_j \, |N_1, N_2, \ldots, N_j = 0, \ldots\rangle = 0$$
$$c_j \, |N_1, N_2, \ldots, N_j = 1, \ldots\rangle = (-1)^{(N_1 + \cdots + N_{j-1})} \, |N_1, N_2, \ldots, N_j = 0, \ldots\rangle$$
$$c_j^\dagger \, |N_1, N_2, \ldots, N_j = 0, \ldots\rangle = (-1)^{(N_1 + \cdots + N_{j-1})} \, |N_1, N_2, \ldots, N_j = 1, \ldots\rangle$$
$$c_j^\dagger \, |N_1, N_2, \ldots, N_j = 1, \ldots\rangle = 0.$$
These obey the anticommutation relations
$$\{c_i, c_j\} = 0, \quad \{c_i^\dagger, c_j^\dagger\} = 0, \quad \{c_i, c_j^\dagger\} = \delta_{ij}.$$
One may notice from this that applying a fermionic creation operator twice gives zero, so it is impossible for the
particles to share single-particle states, in accordance with the exclusion principle.
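The sign factor in the definitions above can be realized with explicit matrices. The sketch below is one possible construction, assuming three modes and a tensor-product (Jordan-Wigner style) representation that the text does not spell out; the function names are hypothetical. It builds the operators and lets one check the anticommutation relations and the vanishing of a squared creation operator.

```python
import numpy as np
from functools import reduce

def fermion_annihilators(n_modes):
    """Matrices c_1..c_n on the 2**n_modes-dimensional Fock space."""
    lower = np.array([[0.0, 1.0], [0.0, 0.0]])   # maps |1> to |0> and kills |0>
    sign = np.diag([1.0, -1.0])                  # (-1)**N for a single mode
    identity = np.eye(2)
    ops = []
    for j in range(n_modes):
        # Sign factors on the modes before j, lowering matrix on mode j, identity after.
        factors = [sign] * j + [lower] + [identity] * (n_modes - j - 1)
        ops.append(reduce(np.kron, factors))
    return ops

def anticommutator(x, y):
    return x @ y + y @ x

c = fermion_annihilators(3)
c_dag = [op.T for op in c]   # real matrices, so the transpose is the Hermitian conjugate
```

With this ordering the basis state |N_1, N_2, N_3> has index 4*N_1 + 2*N_2 + N_3, so the matrix element `c[1][4, 6]` is the amplitude for the second annihilation operator to take |1, 1, 0> to |1, 0, 0>, and it carries the sign (-1)^(N_1) = -1.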
Field operators
We have previously mentioned that there can be more than one way of indexing the degrees of freedom in a quantum
field. Second quantization indexes the field by enumerating the single-particle quantum states. However, as we have
discussed, it is more natural to think about a "field", such as the electromagnetic field, as a set of degrees of freedom
indexed by position.
To this end, we can define field operators that create or destroy a particle at a particular point in space. In particle
physics, these operators turn out to be more convenient to work with, because they make it easier to formulate
theories that satisfy the demands of relativity.
Single-particle states are usually enumerated in terms of their momenta (as in the particle in a box problem.) We can
construct field operators by applying the Fourier transform to the creation and annihilation operators for these states.
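For example, the bosonic field annihilation operator $\phi(\mathbf{r})$ is
$$\phi(\mathbf{r}) = \sum_j e^{i \mathbf{k}_j \cdot \mathbf{r}} a_j,$$
where $\mathbf{k}_j$ is the wave vector of the $j$-th single-particle state (normalization factors are omitted).
The bosonic field operators obey the commutation relations
$$[\phi(\mathbf{r}), \phi(\mathbf{r}')] = 0, \quad [\phi^\dagger(\mathbf{r}), \phi^\dagger(\mathbf{r}')] = 0, \quad [\phi(\mathbf{r}), \phi^\dagger(\mathbf{r}')] = \delta^3(\mathbf{r} - \mathbf{r}'),$$
where $\delta(\mathbf{x})$ stands for the Dirac delta function. As before, the fermionic relations are the same, with the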
commutators replaced by anticommutators.
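A discrete analogue makes the origin of the delta function visible. The sketch below assumes a hypothetical periodic one-dimensional lattice with `n_sites` points and a 1/sqrt(n_sites) normalization (both are choices of this illustration, not of the text). Because the momentum-space operators satisfy [a_k, a_q^dagger] = delta_kq, the field commutator reduces to a product of Fourier matrices, which is a Kronecker delta in position.

```python
import numpy as np

n_sites = 8                                   # hypothetical periodic 1D lattice
sites = np.arange(n_sites)
momenta = 2 * np.pi * np.arange(n_sites) / n_sites

# phi_x = sum_k U[x, k] a_k  with  U[x, k] = exp(i k x) / sqrt(n_sites)
U = np.exp(1j * np.outer(sites, momenta)) / np.sqrt(n_sites)

# [phi_x, phi_y^dagger] = sum_k U[x, k] * conj(U[y, k]) = (U U^dagger)[x, y]
field_commutator = U @ U.conj().T
```

In the continuum limit the Kronecker delta on lattice sites becomes the Dirac delta function of the commutation relation.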
It should be emphasized that the field operator is not the same thing as a single-particle wavefunction. The former is
an operator acting on the Fock space, and the latter is a quantum-mechanical amplitude for finding a particle in some
position. However, they are closely related, and are indeed commonly denoted with the same symbol. If we have a
Hamiltonian with a space representation, say
$$H = -\frac{\hbar^2}{2m} \sum_i \nabla_i^2 + \sum_{i<j} U(|\mathbf{r}_i - \mathbf{r}_j|),$$
where the indices $i$ and $j$ run over all particles, then the field theory Hamiltonian (in the non-relativistic limit and
for negligible self-interactions) is
$$H = -\frac{\hbar^2}{2m} \int d^3r \, \phi^\dagger(\mathbf{r}) \nabla^2 \phi(\mathbf{r}) + \frac{1}{2} \int d^3r \int d^3r' \, \phi^\dagger(\mathbf{r}) \phi^\dagger(\mathbf{r}') U(|\mathbf{r} - \mathbf{r}'|) \phi(\mathbf{r}') \phi(\mathbf{r}).$$
This looks remarkably like an expression for the expectation value of the energy, with $\phi$ playing the role of the
wavefunction. This relationship between the field operators and wavefunctions makes it very easy to formulate field
theories starting from space-projected Hamiltonians.
Implications of quantum field theory
Unification of fields and particles
The "second quantization" procedure that we have outlined in the previous section takes a set of single-particle
quantum states as a starting point. Sometimes, it is impossible to define such single-particle states, and one must
proceed directly to quantum field theory. For example, a quantum theory of the electromagnetic field must be a
quantum field theory, because it is impossible (for various reasons) to define a wavefunction for a single photon. In
such situations, the quantum field theory can be constructed by examining the mechanical properties of the classical
field and guessing the corresponding quantum theory. For free (non-interacting) quantum fields, the quantum field
theories obtained in this way have the same properties as those obtained using second quantization, such as
well-defined creation and annihilation operators obeying commutation or anticommutation relations.
Quantum field theory thus provides a unified framework for describing "field-like" objects (such as the
electromagnetic field, whose excitations are photons) and "particle-like" objects (such as electrons, which are treated
as excitations of an underlying electron field), so long as one can treat interactions as "perturbations" of free fields.
There are still unsolved problems relating to the more general case of interacting fields that may or may not be
adequately described by perturbation theory. For more on this topic, see Haag's theorem.
Physical meaning of particle indistinguishability
The second quantization procedure relies crucially on the particles being identical. We would not have been able to
construct a quantum field theory from a distinguishable many-particle system, because there would have been no
way of separating and indexing the degrees of freedom.
Many physicists prefer to take the converse interpretation, which is that quantum field theory explains what identical
particles are. In ordinary quantum mechanics, there is not much theoretical motivation for using symmetric
(bosonic) or antisymmetric (fermionic) states, and the need for such states is simply regarded as an empirical fact.
From the point of view of quantum field theory, particles are identical if and only if they are excitations of the same
underlying quantum field. Thus, the question "why are all electrons identical?" arises from mistakenly regarding
individual electrons as fundamental objects, when in fact it is only the electron field that is fundamental.
Particle conservation and non-conservation
During second quantization, we started with a Hamiltonian and state space describing a fixed number of particles
($N$) and ended with a Hamiltonian and state space for an arbitrary number of particles. Of course, in many common
situations $N$ is an important and perfectly well-defined quantity, e.g. if we are describing a gas of atoms sealed in a
box. From the point of view of quantum field theory, such situations are described by quantum states that are
eigenstates of the number operator $\hat{N}$, which measures the total number of particles present. As with any quantum
mechanical observable, $\hat{N}$ is conserved if it commutes with the Hamiltonian. In that case, the quantum state is
trapped in the $N$-particle subspace of the total Fock space, and the situation could equally well be described by
ordinary $N$-particle quantum mechanics. (Strictly speaking, this is only true in the noninteracting case or in the low
energy density limit of renormalized quantum field theories.)
For example, we can see that the free-boson Hamiltonian described above conserves particle number. Whenever the
Hamiltonian operates on a state, each particle destroyed by an annihilation operator $a_k$ is immediately put back by
the creation operator $a_k^\dagger$.
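The conservation statement can be verified directly by computing a commutator. The sketch below assumes two single-particle states, a hypothetical occupation cutoff `dim` per mode, and illustrative energies; for contrast it also includes a term that creates or destroys single bosons, which is not part of the free Hamiltonian.

```python
import numpy as np

def annihilation(dim):
    return np.diag(np.sqrt(np.arange(1, dim)), k=1)

def commutator(x, y):
    return x @ y - y @ x

dim = 5                                    # hypothetical cutoff per mode
a = annihilation(dim)
eye = np.eye(dim)
a1, a2 = np.kron(a, eye), np.kron(eye, a)  # two single-particle states, k = 1, 2

energies = (1.0, 2.5)                      # illustrative values of E_k
n1, n2 = a1.T @ a1, a2.T @ a2              # occupation-number operators
total_number = n1 + n2
h_free = energies[0] * n1 + energies[1] * n2

# A term that creates or destroys single bosons, for contrast.
h_source = a1 + a1.T

conserved = np.allclose(commutator(h_free, total_number), 0.0)
not_conserved = not np.allclose(commutator(h_source, total_number), 0.0)
```

Both flags come out true: the free Hamiltonian commutes with the total number operator, while the source-like term does not.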
On the other hand, it is possible, and indeed common, to encounter quantum states that are not eigenstates of $\hat{N}$,
which do not have well-defined particle numbers. Such states are difficult or impossible to handle using ordinary
quantum mechanics, but they can be easily described in quantum field theory as quantum superpositions of states
having different values of $N$. For example, suppose we have a bosonic field whose particles can be created or
destroyed by interactions with a fermionic field. The Hamiltonian of the combined system would be given by the
Hamiltonians of the free boson and free fermion fields, plus a "potential energy" term such as
$$H_I = \sum_{k,q} V_q \, (a_q + a_{-q}^\dagger) \, c_{k+q}^\dagger c_k,$$
where $a_k^\dagger$ and $a_k$ denote the bosonic creation and annihilation operators, $c_k^\dagger$ and $c_k$ denote the fermionic
creation and annihilation operators, and $V_q$ is a parameter that describes the strength of the interaction. This
"interaction term" describes processes in which a fermion in state $k$ either absorbs or emits a boson, thereby being
kicked into a different eigenstate $k + q$. (In fact, this type of Hamiltonian is used to describe interaction between
conduction electrons and phonons in metals. The interaction between electrons and photons is treated in a similar
way, but is a little more complicated because the role of spin must be taken into account.) One thing to notice here is
that even if we start out with a fixed number of bosons, we will typically end up with a superposition of states with
different numbers of bosons at later times. The number of fermions, however, is conserved in this case.
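A stripped-down version of such an interaction shows both conservation statements at once. The sketch below is a simplification, not the full Hamiltonian: it assumes a single boson mode with a hypothetical cutoff `n_max`, two fermion states, and an illustrative coupling `V`; the fermion hops between the two states while a boson is absorbed or emitted.

```python
import numpy as np

def commutator(x, y):
    return x @ y - y @ x

# One boson mode, truncated to occupation numbers 0..3 (hypothetical cutoff).
n_max = 4
a = np.diag(np.sqrt(np.arange(1, n_max)), k=1)

# Two fermion modes, with the sign factor on the first mode built into c2.
lower = np.array([[0.0, 1.0], [0.0, 0.0]])
sign = np.diag([1.0, -1.0])
c1 = np.kron(lower, np.eye(2))
c2 = np.kron(sign, lower)

# Embed everything in the product space (boson) x (fermions).
A = np.kron(a, np.eye(4))
C1 = np.kron(np.eye(n_max), c1)
C2 = np.kron(np.eye(n_max), c2)

V = 0.3                                   # illustrative coupling strength
hop = C2.T @ C1 + C1.T @ C2               # fermion moves between the two states
h_int = V * (A + A.T) @ hop               # a boson is absorbed or emitted in the process

boson_number = A.T @ A
fermion_number = C1.T @ C1 + C2.T @ C2
```

Here `commutator(h_int, fermion_number)` vanishes while `commutator(h_int, boson_number)` does not, matching the statement that only the fermion number is conserved.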
In condensed matter physics, states with ill-defined particle numbers are particularly important for describing the
various superfluids. Many of the defining characteristics of a superfluid arise from the notion that its quantum state is
a superposition of states with different particle numbers. In addition, the concept of a coherent state (used to model
the laser and the BCS ground state) refers to a state with an ill-defined particle number but a well-defined phase.
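For a single bosonic mode, the spread in particle number of a coherent state can be computed explicitly. The sketch below assumes the standard Fock-basis expansion with a real amplitude `alpha` and a hypothetical cutoff `n_max`, both chosen for illustration.

```python
from math import exp, factorial, sqrt

alpha = 2.0      # illustrative real amplitude; the mean particle number is alpha**2
n_max = 60       # hypothetical cutoff, large enough that the neglected tail is negligible

# Fock-basis amplitudes: exp(-alpha^2 / 2) * alpha^n / sqrt(n!)
amplitudes = [exp(-alpha ** 2 / 2) * alpha ** n / sqrt(factorial(n)) for n in range(n_max)]
probabilities = [amp ** 2 for amp in amplitudes]      # a Poisson distribution in n

total = sum(probabilities)
mean_number = sum(n * p for n, p in enumerate(probabilities))
variance = sum(n ** 2 * p for n, p in enumerate(probabilities)) - mean_number ** 2
```

The variance equals the mean, so the particle number is spread over many values rather than fixed, which is the sense in which the particle number is ill-defined.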
Axiomatic approaches
The preceding description of quantum field theory follows the spirit in which most physicists approach the subject.
However, it is not mathematically rigorous. Over the past several decades, there have been many attempts to put
quantum field theory on a firm mathematical footing by formulating a set of axioms for it. These attempts fall into
two broad classes.
The first class of axioms, first proposed during the 1950s, includes the Wightman, Osterwalder-Schrader, and
Haag-Kastler systems. They attempted to formalize the physicists' notion of an "operator-valued field" within the
context of functional analysis, and enjoyed limited success. It was possible to prove that any quantum field theory
satisfying these axioms satisfied certain general theorems, such as the spin-statistics theorem and the CPT theorem.
Unfortunately, it proved extraordinarily difficult to show that any realistic field theory, including the Standard
Model, satisfied these axioms. Most of the theories that could be treated with these analytic axioms were physically
trivial, being restricted to low dimensions and lacking interesting dynamics. The construction of theories satisfying
one of these sets of axioms falls in the field of constructive quantum field theory. Important work was done in this
area in the 1970s by Segal, Glimm, Jaffe and others.
During the 1980s, a second set of axioms based on geometric ideas was proposed. This line of investigation, which
restricts its attention to a particular class of quantum field theories known as topological quantum field theories, is
associated most closely with Michael Atiyah and Graeme Segal, and was notably expanded upon by Edward Witten,
Richard Borcherds, and Maxim Kontsevich. However, most of the physically relevant quantum field theories, such
as the Standard Model, are not topological quantum field theories; the quantum field theory of the fractional
quantum Hall effect is a notable exception. The main impact of axiomatic topological quantum field theory has been
on mathematics, with important applications in representation theory, algebraic topology, and differential geometry.
Finding the proper axioms for quantum field theory is still an open and difficult problem in mathematics. One of the
Millennium Prize Problems — proving the existence of a mass gap in Yang-Mills theory — is linked to this issue.
Phenomena associated with quantum field theory
In the previous part of the article, we described the most general properties of quantum field theories. Some of the
quantum field theories studied in various fields of theoretical physics possess additional special properties, such as
renormalizability, gauge symmetry, and supersymmetry. These are described in the following sections.
Renormalization
Early in the history of quantum field theory, it was found that many seemingly innocuous calculations, such as the
perturbative shift in the energy of an electron due to the presence of the electromagnetic field, give infinite results.
The reason is that the perturbation theory for the shift in an energy involves a sum over all other energy levels, and
there are infinitely many levels at short distances that each give a finite contribution.
Many of these problems are related to failures in classical electrodynamics that were identified but unsolved in the
19th century, and they basically stem from the fact that many of the supposedly "intrinsic" properties of an electron
are tied to the electromagnetic field that it carries around with it. The energy carried by a single electron — its self
energy — is not simply the bare value, but also includes the energy contained in its electromagnetic field, its
attendant cloud of photons. The energy in a field of a spherical source diverges in both classical and quantum
mechanics, but as discovered by Weisskopf, in quantum mechanics the divergence is much milder, going only as the
logarithm of the radius of the sphere.
The solution to the problem, presciently suggested by Stueckelberg, independently by Bethe after the crucial
experiment by Lamb, implemented at one loop by Schwinger, and systematically extended to all loops by Feynman
and Dyson, with converging work by Tomonaga in isolated postwar Japan, is called renormalization. The technique
of renormalization recognizes that the problem is essentially purely mathematical, that extremely short distances are
at fault. In order to define a theory on a continuum, first place a cutoff on the fields, by postulating that quanta
cannot have energies above some extremely high value. This has the effect of replacing continuous space by a
structure where very short wavelengths do not exist, as on a lattice. Lattices break rotational symmetry, and one of
the crucial contributions made by Feynman, Pauli and Villars, and modernized by 't Hooft and Veltman, is a
symmetry preserving cutoff for perturbation theory. There is no known symmetrical cutoff outside of perturbation
theory, so for rigorous or numerical work people often use an actual lattice.
On a lattice, every quantity is finite but depends on the spacing. When taking the limit of zero spacing, we make sure
that the physically observable quantities like the observed electron mass stay fixed, which means that the constants
in the Lagrangian defining the theory depend on the spacing. Hopefully, by allowing the constants to vary with the
lattice spacing, all the results at long distances become insensitive to the lattice, defining a continuum limit.
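The logic of holding an observable fixed while the constants vary with the cutoff can be illustrated with a toy calculation. The sketch below is not derived from any field theory: the "observable" is a hypothetical function built from a logarithmically divergent integral, and the names `g`, `p0`, `f0` and `p` are illustrative. The bare constant is tuned so that the observable takes a fixed value at one scale, and the prediction at another scale then approaches a finite limit as the cutoff (roughly the inverse lattice spacing) grows.

```python
from math import log

def loop_integral(cutoff, p):
    """Closed form of the toy integral  int_0^cutoff k dk / (k^2 + p^2)."""
    return 0.5 * log(1.0 + (cutoff / p) ** 2)

g = 0.1             # illustrative coupling
p0, f0 = 1.0, 2.0   # reference scale and the "measured" value imposed there
p = 3.0             # scale at which a prediction is made

def bare_constant(cutoff):
    # Chosen so that the toy observable equals f0 at the scale p0 for every cutoff.
    return f0 - g * loop_integral(cutoff, p0)

def observable(cutoff, scale):
    return bare_constant(cutoff) + g * loop_integral(cutoff, scale)

cutoffs = [1e2, 1e4, 1e6]                     # cutoff ~ 1 / lattice spacing
bare_values = [bare_constant(c) for c in cutoffs]
predictions = [observable(c, p) for c in cutoffs]
continuum_limit = f0 + g * log(p0 / p)        # limit of the prediction as the cutoff grows
```

In this toy the bare constant drifts logarithmically with the cutoff while the prediction converges, which mirrors the statement that long-distance results become insensitive to the lattice.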
The renormalization procedure only works for a certain class of quantum field theories, called renormalizable
quantum field theories. A theory is perturbatively renormalizable when the constants in the Lagrangian only
diverge at worst as logarithms of the lattice spacing for very short spacings. The continuum limit is then well defined
in perturbation theory, and even if it is not fully well defined non-perturbatively, the problems only show up at
distance scales that are exponentially small in the inverse coupling for weak couplings. The Standard Model of
particle physics is perturbatively renormalizable, and so are its component theories (quantum
electrodynamics/electroweak theory and quantum chromodynamics). Of the three components, quantum
electrodynamics is believed not to have a continuum limit, while the asymptotically free SU(2) and SU(3) weak
isospin and strong color interactions are believed to be nonperturbatively well defined.
The renormalization group describes how renormalizable theories emerge as the long distance low-energy effective
field theory for any given high-energy theory. Because of this, renormalizable theories are insensitive to the precise
nature of the underlying high-energy short-distance phenomena. This is a blessing because it allows physicists to
formulate low energy theories without knowing the details of high energy phenomena. It is also a curse, because
once a renormalizable theory like the Standard Model is found to work, it gives very few clues to higher energy
processes. The only way high energy processes can be seen in the Standard Model is when they allow otherwise
forbidden events, or if they predict quantitative relations between the coupling constants.
Gauge freedom
A gauge theory is a theory that admits a symmetry with a local parameter. For example, in every quantum theory the
global phase of the wave function is arbitrary and does not represent something physical. Consequently, the theory is
invariant under a global change of phases (adding a constant to the phase of all wave functions, everywhere); this is a
global symmetry. In quantum electrodynamics, the theory is also invariant under a local change of phase; that is,
one may shift the phase of all wave functions by an amount that may be different at every point in space-time. This is a
local symmetry. However, in order for a well-defined derivative operator to exist, one must introduce a new field,
the gauge field, which also transforms in order for the local change of variables (the phase in our example) not to
affect the derivative. In quantum electrodynamics this gauge field is the electromagnetic field. Such a local change
of variables is termed a gauge transformation.
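The role of the gauge field in keeping derivative terms invariant can be seen in a lattice discretization. The sketch below assumes a hypothetical periodic one-dimensional lattice, with the matter field on sites and the gauge field represented by phases on the links between neighbouring sites; this discretized form is a choice of the illustration, not something the text prescribes.

```python
import numpy as np

rng = np.random.default_rng(0)
n_sites = 16                                                     # hypothetical periodic lattice

psi = rng.normal(size=n_sites) + 1j * rng.normal(size=n_sites)   # matter field on sites
links = np.exp(1j * rng.uniform(0, 2 * np.pi, size=n_sites))     # gauge field on links x -> x+1
alpha = rng.uniform(0, 2 * np.pi, size=n_sites)                  # local phase shift alpha(x)

def hopping(psi, links):
    """Sum over x of conj(psi_x) * U_x * psi_{x+1}, a discretized derivative term."""
    return np.sum(np.conj(psi) * links * np.roll(psi, -1))

# Local phase change of the matter field ...
psi_new = np.exp(1j * alpha) * psi
# ... compensated by the transformation of the gauge field on each link.
links_new = np.exp(1j * alpha) * links * np.exp(-1j * np.roll(alpha, -1))

with_gauge_field = hopping(psi_new, links_new)
without_gauge_field = hopping(psi_new, np.ones(n_sites))   # nothing absorbs the phase shift
```

The value `with_gauge_field` agrees with `hopping(psi, links)`, whereas the term without a transforming gauge field changes under the local phase shift and is only invariant under a constant (global) one.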
In quantum field theory the excitations of fields represent particles. The particle associated with excitations of the
gauge field is the gauge boson, which is the photon in the case of quantum electrodynamics.
The degrees of freedom in quantum field theory are local fluctuations of the fields. The existence of a gauge
symmetry reduces the number of degrees of freedom, simply because some fluctuations of the fields can be
transformed to zero by gauge transformations, so they are equivalent to having no fluctuations at all, and they
therefore have no physical meaning. Such fluctuations are usually called "non-physical degrees of freedom" or gauge
artifacts; usually some of them have a negative norm, making them inadequate for a consistent theory. Therefore, if
a classical field theory has a gauge symmetry, then its quantized version (i.e. the corresponding quantum field
theory) must have this symmetry as well. In other words, a gauge symmetry cannot have a quantum anomaly. If a
gauge symmetry is anomalous (i.e. not preserved in the quantum theory), then the theory is inconsistent. For example, in
quantum electrodynamics, a gauge anomaly would require the appearance of photons with
longitudinal polarization and polarization in the time direction, the latter having a negative norm, rendering the
theory inconsistent; another possibility would be for these photons to appear only in intermediate processes but not
in the final products of any interaction, making the theory non-unitary and again inconsistent (see optical theorem).
In general, the gauge transformations of a theory consist of several different transformations, which may not be
commutative. These transformations are together described by a mathematical object known as a gauge group.
Infinitesimal gauge transformations are the gauge group generators. Therefore the number of gauge bosons is the
group dimension (i.e. number of generators forming a basis).
All the fundamental interactions in nature are described by gauge theories. These are:
• Quantum electrodynamics, whose gauge transformation is a local change of phase, so that the gauge group is
U(1). The gauge boson is the photon.
• Quantum chromodynamics, whose gauge group is SU(3). The gauge bosons are eight gluons.
• The electroweak theory, whose gauge group is U(1) × SU(2) (a direct product of U(1) and SU(2)).
• Gravity, whose classical theory is general relativity, admits the equivalence principle, which is a form of gauge
symmetry. However, it is explicitly non-renormalizable.
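The rule that the number of gauge bosons equals the dimension of the gauge group can be applied to the groups listed above. The sketch below uses the standard generator counts for U(n) and SU(n); the function names are hypothetical, and for a direct product the dimensions add.

```python
def dim_u(n):
    """Number of independent generators of U(n)."""
    return n * n

def dim_su(n):
    """Number of independent generators of SU(n)."""
    return n * n - 1

gauge_bosons = {
    "QED, U(1)": dim_u(1),                                # the photon
    "QCD, SU(3)": dim_su(3),                              # the gluons
    "electroweak, U(1) x SU(2)": dim_u(1) + dim_su(2),    # dimensions add for a direct product
}
```

The counts are 1, 8 and 4 respectively, in line with the single photon and the eight gluons quoted in the list.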
Multivalued Gauge Transformations
The gauge transformations which leave the theory invariant involve, by definition, only single-valued gauge functions,
which satisfy the integrability criterion. Multivalued gauge transformations instead use gauge
functions which violate the integrability criterion. These are capable of changing the physical field strengths and are
therefore not proper symmetry transformations. Nevertheless, the transformed field equations describe correctly the
physical laws in the presence of the newly generated field strengths. See the textbook by H. Kleinert cited below for
the applications to phenomena in physics.
Supersymmetry
Supersymmetry assumes that every fundamental fermion has a superpartner that is a boson and vice versa. It was
introduced in order to solve the so-called Hierarchy Problem, that is, to explain why particles not protected by any
symmetry (like the Higgs boson) do not receive radiative corrections to their masses that drive them up to the larger scales (GUT,
Planck, ...). It was soon realized that supersymmetry has other interesting properties: its gauged version is an
extension of general relativity (Supergravity), and it is a key ingredient for the consistency of string theory.
The way supersymmetry protects the hierarchies is the following: since for every particle there is a superpartner with
the same mass, any loop in a radiative correction is cancelled by the loop corresponding to its superpartner,
rendering the theory UV finite.
Since no superpartners have yet been observed, if supersymmetry exists it must be broken (through a so-called soft
term, which breaks supersymmetry without ruining its helpful features). The simplest models of this breaking require
that the energy of the superpartners not be too high; in these cases, supersymmetry is expected to be observed by
experiments at the Large Hadron Collider.
See also
• List of quantum field theories
• Constructive quantum field theory
• Feynman path integral
• Quantum chromodynamics
• Quantum electrodynamics
• Quantum flavordynamics
• Quantum geometrodynamics
• Quantum hydrodynamics
• Quantum magnetodynamics
• Quantum triviality
• Schwinger-Dyson equation
• Einstein-Maxwell-Dirac equations
• Relation between Schrödinger's equation and the path integral formulation of quantum mechanics
• Abraham-Lorentz force
• Green's function (many-body theory)
• Common integrals in quantum field theory
• Wheeler-Feynman absorber theory
• Wigner's theorem
• Wigner's classification
• Static forces and virtual-particle exchange
• Photon polarization
• Theoretical and experimental justification for the Schrödinger equation
• Invariance mechanics
• Form factor
• Green-Kubo relations
• Basic concepts of quantum mechanics
• Relationship between string theory and quantum field theory
Notes
[1] Weinberg, S. Quantum Field Theory, Vols. I to III, 2000, Cambridge University Press: Cambridge, UK.
[2] (http://physics.princeton.edu/~mcdonald/examples/QM/thorn_ajp_72_1210_04.pdf)
[3] Dirac, P.A.M. (1927). The Quantum Theory of the Emission and Absorption of Radiation, Proceedings of the Royal Society of London, Series
A, Vol. 114, p. 243.
[4] Abraham Pais, Inward Bound: Of Matter and Forces in the Physical World ISBN 0-19-851997-4. Pais recounts his astonishment at the
rapidity with which Feynman could calculate using his method. Feynman's method is now part of the standard methods for physicists.
Further reading
General readers:
• Feynman, R.P. (2001) [1964]. The Character of Physical Law. MIT Press. ISBN 0262560038.
• Feynman, R.P. (2006) [1985]. QED: The Strange Theory of Light and Matter. Princeton University Press.
ISBN 0691125759.
• Gribbin, J. (1998). Q is for Quantum: Particle Physics from A to Z. Weidenfeld & Nicolson. ISBN 0297817523.
• Schumm, Bruce A. (2004) Deep Down Things. Johns Hopkins Univ. Press. Chpt. 4.
Introductory texts:
• Bogoliubov, N.; Shirkov, D. (1982). Quantum Fields. Benjamin-Cummings. ISBN 0805309837.
• Frampton, P.H. (2000). Gauge Field Theories. Frontiers in Physics (2nd ed.). Wiley.
• Greiner, W.; Müller, B. (2000). Gauge Theory of Weak Interactions. Springer. ISBN 3-540-67672-4.
• Itzykson, C.; Zuber, J.-B. (1980). Quantum Field Theory. McGraw-Hill. ISBN 0-07-032071-3.
• Kane, G.L. (1987). Modern Elementary Particle Physics. Perseus Books. ISBN 0-201-11749-5.
• Kleinert, H.; Schulte-Frohlinde, Verena (2001). Critical Properties of φ^4-Theories (http://users.physik.fu-berlin.de/~kleinert/re.html#B6). World Scientific. ISBN 981-02-4658-7.
• Kleinert, H. (2008). Multivalued Fields in Condensed Matter, Electrodynamics, and Gravitation (http://users.physik.fu-berlin.de/~kleinert/public_html/kleiner_reb11/psfiles/mvf.pdf). World Scientific. ISBN 978-981-279-170-2.
• Loudon, R. (1983). The Quantum Theory of Light. Oxford University Press. ISBN 0-19-851155-8.
• Mandl, F.; Shaw, G. (1993). Quantum Field Theory. John Wiley & Sons. ISBN 0-471-94186-7.
• Peskin, M.; Schroeder, D. (1995). An Introduction to Quantum Field Theory. Westview Press.
ISBN 0-201-50397-2.
• Ryder, L.H. (1985). Quantum Field Theory. Cambridge University Press. ISBN 0-521-33859-X.
• Srednicki, Mark (2007). Quantum Field Theory (http://www.cambridge.org/us/catalogue/catalogue.asp?isbn=0521864496). Cambridge Univ. Press.
• Yndurain, F.J. (1996). Relativistic Quantum Mechanics and Introduction to Field Theory (1st ed.). Springer. ISBN 978-3540604532.
• Zee, A. (2003). Quantum Field Theory in a Nutshell. Princeton University Press. ISBN 0-691-01019-6.
Advanced texts:
• Bogoliubov, N.; Logunov, A. A.; Oksak, A.I.; Todorov, I.T. (1990). General Principles of Quantum Field Theory.
Kluwer Academic Publishers. ISBN 978-0792305408.
• Weinberg, S. (1995). The Quantum Theory of Fields. 1-3. Cambridge University Press.
Articles:
• Gerard 't Hooft (2007) "The Conceptual Basis of Quantum Field Theory (http://www.phys.uu.nl/~thooft/lectures/basisqft.pdf)" in Butterfield, J., and John Earman, eds., Philosophy of Physics, Part A. Elsevier: 661-730.
• Frank Wilczek (1999) "Quantum field theory (http://arxiv.org/abs/hep-th/9803075)", Reviews of Modern Physics 71: S83-S95. Also doi=10.1103/Rev. Mod. Phys. 71 .
External links
• Stanford Encyclopedia of Philosophy: "Quantum Field Theory (http://plato.stanford.edu/entries/quantum-field-theory/)" by Meinard Kuhlmann.
• Siegel, Warren, 2005. Fields (http://insti.physics.sunysb.edu/~siegel/errata.html). A free text, also available from arXiv:hep-th/9912205.
• Pedagogic Aids to Quantum Field Theory (http://quantumfieldtheory.info). Click on "Introduction" for a simplified introduction suitable for someone familiar with quantum mechanics.
• Free condensed matter books and notes (http://www.freebookcentre.net/Physics/Condensed-Matter-Books.html).
• Quantum field theory texts (http://motls.blogspot.com/2006/01/qft-didactics.html), a list with links to amazon.com.
• Quantum Field Theory (http://www.nat.vu.nl/~mulders/QFT-0.pdf) by P. J. Mulders
• Quantum Field Theory (http://damtp.cam.ac.uk/user/tong/qft/qft.pdf) by David Tong
• Quantum Field Theory Video Lectures (http://pirsa.org/index.php?p=speaker&name=David_Tong) by David Tong
• Quantum Field Theory Lecture Notes (http://www2.physics.utoronto.ca/~luke/PHY2403/References_files/lecturenotes.pdf) by Michael Luke
• Quantum Field Theory Video Lectures (http://www.physics.harvard.edu/about/Phys253.html) by Sidney R. Coleman
• Quantum Field Theory Lecture Notes (http://www.physics.gla.ac.uk/~dmiller/lectures/RQF_1-6_2010.pdf) by D.J. Miller
• Quantum Field Theory Lecture Notes II (http://www.physics.gla.ac.uk/~dmiller/lectures/RQF_7-9_2010.pdf) by D.J. Miller
Scalar field theory
In theoretical physics, scalar field theory can refer to a classical or quantum theory of scalar fields. A field which is
invariant under any Lorentz transformation is called a "scalar", in contrast to a vector or tensor field. The quanta of
the quantized scalar field are spin-zero particles, and as such are bosons.
No fundamental scalar fields have been observed in nature, though the Higgs boson may yet prove the first example.
However, scalar fields appear in the effective field theory descriptions of many physical phenomena. An example is
the pion, which is actually a "pseudoscalar", which means it is not invariant under parity transformations which
invert the spatial directions, distinguishing it from a true scalar, which is parity-invariant. Because of the relative
simplicity of the mathematics involved, scalar fields are often the first field introduced to a student of classical or
quantum field theory.
In this article, the repeated index notation indicates the Einstein summation convention for summation over repeated
indices. The theories described are defined in flat, D-dimensional Minkowski space, with (D-1) spatial dimensions
and one time dimension and are, by construction, relativistically covariant. The Minkowski space metric, $\eta_{\mu\nu}$, has a
particularly simple form: it is diagonal, and here we use the $+ - - -$ sign convention.
Classical scalar field theory
Linear (free) theory
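The most basic scalar field theory is the linear theory. The action for the free relativistic scalar field theory is
$$S = \int d^{D-1}x \, dt \, \mathcal{L} = \int d^{D-1}x \, dt \left[ \frac{1}{2} \eta^{\mu\nu} \partial_\mu \phi \, \partial_\nu \phi - \frac{1}{2} m^2 \phi^2 \right] = \int d^{D-1}x \, dt \left[ \frac{1}{2} (\partial_t \phi)^2 - \frac{1}{2} (\nabla \phi)^2 - \frac{1}{2} m^2 \phi^2 \right],$$
where $\mathcal{L}$ is known as a Lagrangian density. This is an example of a quadratic action, since each of the terms is
quadratic in the field, $\phi$. The term proportional to $m^2$ is sometimes known as a mass term, due to its interpretation
in the quantized version of this theory in terms of particle mass.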
The equation of motion for this theory is obtained by extremizing the action above. It takes the following form,
linear in $\phi$:
$$\eta^{\mu\nu} \partial_\mu \partial_\nu \phi + m^2 \phi = \partial_t^2 \phi - \nabla^2 \phi + m^2 \phi = 0.$$
Note that this is the same as the Klein-Gordon equation, but that here the interpretation is as a classical field
equation, rather than as a quantum mechanical wave equation.
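The step from the action to the equation of motion can be made explicit with the Euler-Lagrange equation for a field. The following is a short worked derivation, assuming the free Lagrangian density with the $+ - - -$ metric signature and fields that vanish at infinity so that boundary terms drop out.

```latex
\begin{align}
\mathcal{L} &= \tfrac{1}{2}\,\eta^{\mu\nu}\,\partial_\mu\phi\,\partial_\nu\phi - \tfrac{1}{2}\,m^2\phi^2, \\
\frac{\partial\mathcal{L}}{\partial\phi} &= -m^2\phi,
\qquad
\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)} = \eta^{\mu\nu}\,\partial_\nu\phi, \\
0 &= \partial_\mu\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)} - \frac{\partial\mathcal{L}}{\partial\phi}
   = \eta^{\mu\nu}\,\partial_\mu\partial_\nu\phi + m^2\phi .
\end{align}
```

Substituting a plane wave $\phi \propto e^{i(\mathbf{k}\cdot\mathbf{x} - \omega t)}$ into the last line gives $\omega^2 = \mathbf{k}^2 + m^2$ in units with $\hbar = c = 1$.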
Nonlinear (interacting) theory
The most common generalization of the linear theory above is to include a scalar potential $V(\phi)$ in the Lagrangian,
where typically $V$ is a polynomial in $\phi$ of degree 3 or more (often a monomial). Such a theory is sometimes
said to be interacting, because the Euler-Lagrange equation is now nonlinear, implying a self-interaction. The
action for the most general such theory is
$$S = \int d^{D-1}x \, dt \, \mathcal{L} = \int d^{D-1}x \, dt \left[ \frac{1}{2} \eta^{\mu\nu} \partial_\mu \phi \, \partial_\nu \phi - V(\phi) \right] = \int d^{D-1}x \, dt \left[ \frac{1}{2} (\partial_t \phi)^2 - \frac{1}{2} (\nabla \phi)^2 - \frac{1}{2} m^2 \phi^2 - \sum_{n=3}^{\infty} \frac{1}{n!} g_n \phi^n \right].$$
The $n!$ factors in the expansion are introduced because they are useful in the Feynman diagram expansion of the
quantum theory, as described below. The corresponding Euler-Lagrange equation of motion is
$$\eta^{\mu\nu} \partial_\mu \partial_\nu \phi + V'(\phi) = \partial_t^2 \phi - \nabla^2 \phi + V'(\phi) = 0.$$
Dimensional analysis and scaling
Physical quantities in these scalar field theories may have dimensions of length, time or mass, or some combination
of the three. However, in a relativistic theory, any quantity $t$, with dimensions of time, can be 'converted' into a
length, $l = ct$, by using the velocity of light, $c$.
Similarly, any length $l$ is equivalent to an inverse mass, $l = \hbar/(mc)$, using Planck's constant, $\hbar$. Heuristically, one
can think of a time as a length, or either time or length as an inverse mass. In short, one can think of the dimensions
of any physical quantity as defined in terms of just one independent dimension, rather than in terms of all three. This
is most often termed the mass dimension of the quantity.
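As a concrete instance of these conversions, the length associated with the electron mass can be computed in SI units. The sketch below uses CODATA 2018 values for the constants; the function names are hypothetical.

```python
# CODATA 2018 values in SI units.
HBAR = 1.054571817e-34          # reduced Planck constant, J s
C = 299_792_458.0               # speed of light, m / s
M_ELECTRON = 9.1093837015e-31   # electron mass, kg

def time_to_length(t_seconds):
    """l = c t"""
    return C * t_seconds

def mass_to_length(m_kg):
    """l = hbar / (m c), the length equivalent to an inverse mass."""
    return HBAR / (m_kg * C)

electron_length = mass_to_length(M_ELECTRON)   # about 3.86e-13 m
```

The result is the reduced Compton wavelength of the electron, which is the length scale that the mass dimension assigns to the electron mass.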
One objection is that this theory is classical, and therefore it is not obvious that Planck's constant should be a part of
the theory at all. In a sense this is a valid objection, and if desired one can indeed recast the theory without mass
dimensions at all. However, this would be at the expense of making the connection with the quantum scalar field
slightly more obscure. Given that one has dimensions of mass, Planck's constant is thought of here as an essentially
arbitrary fixed quantity with dimensions appropriate to convert between mass and inverse length. This is consistent
with the Feynman path integral approach to quantization, where the only reason for Planck's constant to appear stems
from the same type of dimensional argument, since the action must be divided by some parameter with these
dimensions to render the phase dimensionless.
Scaling Dimension
The classical scaling dimension, or mass dimension, $\Delta$, of $\phi$ describes the transformation of the field under a
rescaling of coordinates:
$$x \to \lambda x, \qquad \phi \to \lambda^{-\Delta}\phi.$$
The units of action are the same as the units of $\hbar$, and so the action itself has zero mass dimension. This fixes the
scaling dimension of $\phi$ to be
$$\Delta = \frac{D-2}{2}.$$
Scale invariance
There is a specific sense in which some scalar field theories are scale-invariant. While the actions above are all
constructed to have zero mass dimension, not all actions are invariant under the scaling transformation
$$x \to \lambda x, \qquad \phi \to \lambda^{-\Delta}\phi.$$
The reason that not all actions are invariant is that one usually thinks of the parameters $m$ and $g_n$ as fixed quantities,
which are not rescaled under the transformation above. The condition for a scalar field theory to be scale invariant is
then quite obvious: all of the parameters appearing in the action should be dimensionless quantities. In other words, a
scale invariant theory is one without any fixed length scale (or equivalently, mass scale) in the theory.
For a scalar field theory with $D$ spacetime dimensions, the only dimensionless parameter $g_n$ satisfies $n = 2D/(D-2)$.
For example, in $D = 4$ only $g_4$ is classically dimensionless, and so the only classically scale-invariant scalar field
theory in $D = 4$ is the massless $\phi^4$ theory. Classical scale invariance normally does not imply quantum scale
invariance. See the discussion of the beta function below.
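The dimension counting behind this statement can be sketched in a few lines. The code assumes, as in the text, that the action is dimensionless, so that every term in the Lagrangian density has mass dimension $D$ and the field has dimension $(D-2)/2$; the function names are hypothetical.

```python
from fractions import Fraction


def field_dimension(D):
    """Classical mass dimension of the scalar field in D spacetime dimensions."""
    return Fraction(D - 2, 2)


def coupling_dimension(n, D):
    """Mass dimension of g_n, fixed by requiring g_n * phi**n to have dimension D."""
    return D - n * field_dimension(D)


def marginal_power(D):
    """Integer n with dimensionless g_n, n = 2D / (D - 2), or None if there is none."""
    if D == 2:
        return None  # the field is dimensionless, so no g_n is dimensionless
    n = Fraction(2 * D, D - 2)
    return int(n) if n.denominator == 1 else None
```

For instance, `coupling_dimension(3, 4)` is 1 (a cubic coupling carries one power of mass in four dimensions), while `marginal_power(4)` is 4 and `marginal_power(6)` is 3.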
Conformal invariance
A transformation
$$x \to \tilde{x}(x)$$
is said to be conformal if the transformation satisfies
$$\frac{\partial\tilde{x}^\mu}{\partial x^\rho}\frac{\partial\tilde{x}^\nu}{\partial x^\sigma}\eta_{\mu\nu} = \lambda^2(x)\,\eta_{\rho\sigma}$$
for some function $\lambda^2(x)$. The conformal group contains as subgroups the isometries of the metric (the
Poincaré group) and also the scaling transformations (or dilatations) considered above. In fact, the scale-invariant
theories in the previous section are also conformally-invariant.
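$\phi^4$ theory
Massive $\phi^4$ theory illustrates a number of interesting phenomena in scalar field theory.
The Lagrangian density is
$$\mathcal{L} = \frac{1}{2}(\partial_t\phi)^2 - \frac{1}{2}(\nabla\phi)^2 - \frac{1}{2}m^2\phi^2 - \frac{g}{4!}\phi^4.$$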
Spontaneous symmetry breaking
This Lagrangian has a $\mathbb{Z}_2$ symmetry under the transformation $\phi \to -\phi$.
This is an example of an internal symmetry, in contrast to a space-time symmetry.
If $m^2$ is positive, the potential $V(\phi) = \frac{1}{2}m^2\phi^2 + \frac{g}{4!}\phi^4$ has a single minimum, at the origin. The solution
$\phi = 0$ is clearly invariant under the $\mathbb{Z}_2$ symmetry. Conversely, if $m^2$ is negative, then one can readily see that the
potential $V(\phi) = \frac{1}{2}m^2\phi^2 + \frac{g}{4!}\phi^4$ has two minima. This is known as a double well potential, and the lowest
energy states (known as the vacua, in quantum field theoretical language) in such a theory are not invariant under the
$\mathbb{Z}_2$ symmetry of the action (in fact it maps each of the two vacua into the other). In this case, the $\mathbb{Z}_2$ symmetry is
said to be spontaneously broken.
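The location of the minima can be made concrete with a short sketch. It assumes the potential $V(\phi) = \frac{1}{2}m^2\phi^2 + \frac{g}{4!}\phi^4$ with $g > 0$, so that $V'(\phi) = m^2\phi + \frac{g}{6}\phi^3$ vanishes at $\phi = 0$ and, for negative $m^2$, at $\phi = \pm\sqrt{-6m^2/g}$. The function names are hypothetical.

```python
import math


def potential(phi, m2, g):
    """V(phi) = m2 * phi^2 / 2 + g * phi^4 / 4!, with m2 standing for m^2."""
    return 0.5 * m2 * phi**2 + g / 24.0 * phi**4


def vacua(m2, g):
    """Field values that minimise V, assuming g > 0."""
    if m2 >= 0:
        return [0.0]            # single minimum at the origin
    v = math.sqrt(-6.0 * m2 / g)
    return [-v, v]              # double well: two degenerate minima
```

With `m2 = -1.0` and `g = 6.0` the two vacua sit at `-1.0` and `1.0`; the transformation $\phi \to -\phi$ swaps them, and both lie below the value of the potential at the origin.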
Kink solutions
The $\phi^4$ theory with a negative $m^2$ also has a kink solution, which is a canonical example of a soliton. Such a
solution is of the form
$$\phi(\vec{x}, t) = \pm\frac{m}{2\sqrt{g/4!}}\tanh\left[\frac{m(x - x_0)}{\sqrt{2}}\right],$$
where $m$ in this expression stands for $\sqrt{-m^2}$ and $x$ is one of the spatial variables ($\phi$ is taken to be independent of $t$ and the
remaining spatial variables). The solution interpolates between the two different vacua of the double well potential. It is not possible to deform the
kink into a constant solution without passing through a solution of infinite energy, and for this reason the kink is said
to be stable. For $D > 2$, i.e. theories with more than one spatial dimension, this solution is called a domain wall.
Another well-known example of a scalar field theory with kink solutions is the sine-Gordon theory.
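A quick numerical check of the $\phi^4$ kink is sketched below. It assumes the static field equation $\phi'' = V'(\phi) = m^2\phi + \frac{g}{6}\phi^3$ with negative $m^2$, and the profile $\phi(x) = \frac{\mu}{2\sqrt{g/4!}}\tanh[\mu(x - x_0)/\sqrt{2}]$ with $\mu = \sqrt{-m^2}$; the helper names and the finite-difference step are illustrative choices.

```python
import math


def kink(x, m2, g, x0=0.0):
    """Kink profile for negative m2 (m2 stands for m^2)."""
    mu = math.sqrt(-m2)
    amplitude = mu / (2.0 * math.sqrt(g / 24.0))
    return amplitude * math.tanh(mu * (x - x0) / math.sqrt(2.0))


def residual(x, m2, g, h=1e-3):
    """phi'' - V'(phi), with phi'' from a central second difference."""
    phi = kink(x, m2, g)
    second = (kink(x + h, m2, g) - 2.0 * phi + kink(x - h, m2, g)) / h**2
    return second - (m2 * phi + g / 6.0 * phi**3)


# Largest violation of the static field equation on a grid from -3 to 3.
worst = max(abs(residual(0.1 * i, -1.0, 6.0)) for i in range(-30, 31))
```

The residual stays at the level of the discretisation error, and far from the centre the profile approaches the vacuum values $\pm\sqrt{-6m^2/g}$.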
Complex scalar field theory
In a complex scalar field theory, the scalar field takes values in the complex numbers, rather than the real numbers.
The action considered normally takes the form
$$S = \int d^{D-1}x\,dt\,\mathcal{L} = \int d^{D-1}x\,dt \left[\eta^{\mu\nu}\partial_\mu\phi^*\partial_\nu\phi - V(|\phi|^2)\right].$$
This has a U(1) symmetry, whose action on the space of fields rotates $\phi \to e^{i\alpha}\phi$, for some real phase angle $\alpha$.
As for the real scalar field, spontaneous symmetry breaking is found if $m^2$ is negative. This gives rise to a Mexican
hat potential which is analogous to the double-well potential in real scalar field theory, but now the choice of
vacuum breaks a continuous U(1) symmetry instead of a discrete one. This leads to a Goldstone boson.
O(N) theory
One can express the complex scalar field theory in terms of two real fields, $\phi^1 = \mathrm{Re}\,\phi$ and $\phi^2 = \mathrm{Im}\,\phi$, which
transform in the vector representation of the $U(1) \cong SO(2)$ internal symmetry. Although such fields transform as a
vector under the internal symmetry, they are still Lorentz scalars. This can be generalised to a theory of N scalar
fields transforming in the vector representation of the O(N) symmetry. The Lagrangian for an O(N)-invariant scalar
field theory is typically of the form
$$\mathcal{L} = \frac{1}{2}\eta^{\mu\nu}\partial_\mu\phi\cdot\partial_\nu\phi - V(\phi\cdot\phi)$$
using an appropriate O(N) -invariant inner product.
Quantum scalar field theory
In quantum field theory, the fields, and all observables constructed from them, are replaced by quantum operators on
a Hilbert space. This Hilbert space is built on a vacuum state, and dynamics are governed by a Hamiltonian, a
positive operator which annihilates the vacuum. A construction of a quantum scalar field theory may be found in the
canonical quantization article, which uses canonical commutation relations among the fields as a basis for the
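construction. In brief, the basic variables are the field $\phi$ and its canonical momentum $\pi$. Both fields are Hermitian.
At spatial points $\vec{x}$, $\vec{y}$ at equal times, the canonical commutation relations are given by
$$[\phi(\vec{x}),\phi(\vec{y})] = [\pi(\vec{x}),\pi(\vec{y})] = 0, \qquad [\phi(\vec{x}),\pi(\vec{y})] = i\delta(\vec{x}-\vec{y}),$$
and the free Hamiltonian is
$$H = \int d^3x \left[\frac{1}{2}\pi^2 + \frac{1}{2}(\nabla\phi)^2 + \frac{m^2}{2}\phi^2\right].$$
A spatial Fourier transform leads to momentum space fields
$$\tilde\phi(\vec{k}) = \int d^3x\, e^{-i\vec{k}\cdot\vec{x}}\phi(\vec{x}), \qquad \tilde\pi(\vec{k}) = \int d^3x\, e^{-i\vec{k}\cdot\vec{x}}\pi(\vec{x}),$$
which are used to define annihilation and creation operators
$$a(\vec{k}) = E\tilde\phi(\vec{k}) + i\tilde\pi(\vec{k}), \qquad a^\dagger(\vec{k}) = E\tilde\phi^\dagger(\vec{k}) - i\tilde\pi^\dagger(\vec{k}),$$
where $E = \sqrt{k^2 + m^2}$ and, for Hermitian fields, $\tilde\phi^\dagger(\vec{k}) = \tilde\phi(-\vec{k})$. These operators satisfy the commutation relations
$$[a(\vec{k}_1),a(\vec{k}_2)] = [a^\dagger(\vec{k}_1),a^\dagger(\vec{k}_2)] = 0, \qquad [a(\vec{k}_1),a^\dagger(\vec{k}_2)] = (2\pi)^3\,2E\,\delta(\vec{k}_1-\vec{k}_2).$$
The state $|0\rangle$ annihilated by all of the operators $a$ is identified as the bare vacuum, and a particle with momentum
$\vec{k}$ is created by applying $a^\dagger(\vec{k})$ to the vacuum. Applying all possible combinations of creation operators to the vacuum
constructs the Hilbert space. This construction is called Fock space. The vacuum is annihilated by the Hamiltonian
$$H = \int \frac{d^3k}{(2\pi)^3}\,\frac{1}{2}\,a^\dagger(\vec{k})a(\vec{k}),$$
where the zero-point energy has been removed by Wick-ordering. (See canonical quantization.)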
Interactions can be included by adding an interaction Hamiltonian. For a $\phi^4$ theory, this corresponds to adding a
Wick-ordered term $g{:}\phi^4{:}/4!$ to the Hamiltonian, and integrating over $x$. Scattering amplitudes may be calculated from
this Hamiltonian in the interaction picture. These are constructed in perturbation theory by means of the Dyson
series, which gives the time-ordered products, or n-particle Green's functions $\langle 0|\mathcal{T}\{\phi(x_1)\cdots\phi(x_n)\}|0\rangle$, as
described in the Dyson series article. The Green's functions may also be obtained from a generating function that is
constructed as a solution to the Schwinger-Dyson equation.
Feynman Path Integral
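The Feynman diagram expansion may also be obtained from the Feynman path integral formulation.[1] The time
ordered vacuum expectation values of polynomials in $\phi$, known as the n-particle Green's functions, are constructed
by integrating over all possible fields, normalized by the vacuum expectation value with no external fields,
$$\langle 0|\mathcal{T}\{\phi(x_1)\cdots\phi(x_n)\}|0\rangle = \frac{\int\mathcal{D}\phi\,\phi(x_1)\cdots\phi(x_n)\,e^{i\int d^4x\left(\frac{1}{2}\partial^\mu\phi\,\partial_\mu\phi - \frac{m^2}{2}\phi^2 - \frac{g}{4!}\phi^4\right)}}{\int\mathcal{D}\phi\,e^{i\int d^4x\left(\frac{1}{2}\partial^\mu\phi\,\partial_\mu\phi - \frac{m^2}{2}\phi^2 - \frac{g}{4!}\phi^4\right)}}.$$
All of these Green's functions may be obtained by expanding the exponential in $J(x)\phi(x)$ in the generating function
$$Z[J] = \int\mathcal{D}\phi\,e^{i\int d^4x\left(\frac{1}{2}\partial^\mu\phi\,\partial_\mu\phi - \frac{m^2}{2}\phi^2 - \frac{g}{4!}\phi^4 + J\phi\right)} = Z[0]\sum_{n=0}^{\infty}\frac{i^n}{n!}\int d^4x_1\cdots d^4x_n\,J(x_1)\cdots J(x_n)\,\langle 0|\mathcal{T}\{\phi(x_1)\cdots\phi(x_n)\}|0\rangle.$$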
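A Wick rotation may be applied to make time imaginary. Changing the signature to (++++) then turns the Feynman
integral into a statistical mechanics partition function in Euclidean space,
$$Z[J] = \int\mathcal{D}\phi\,e^{-\int d^4x\left(\frac{1}{2}(\nabla\phi)^2 + \frac{m^2}{2}\phi^2 + \frac{g}{4!}\phi^4 - J\phi\right)}.$$
Normally, this is applied to the scattering of particles with fixed momenta, in which case a Fourier transform is
useful, giving instead
$$\tilde{Z}[\tilde{J}] = \int\mathcal{D}\tilde\phi\,\exp\left\{-\int\frac{d^4p}{(2\pi)^4}\left[\frac{1}{2}(p^2+m^2)\tilde\phi(p)\tilde\phi(-p) - \tilde{J}(-p)\tilde\phi(p)\right] - \frac{g}{4!}\int\prod_{i=1}^{4}\frac{d^4p_i}{(2\pi)^4}\,(2\pi)^4\delta(p_1+p_2+p_3+p_4)\,\tilde\phi(p_1)\tilde\phi(p_2)\tilde\phi(p_3)\tilde\phi(p_4)\right\}.$$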
The standard trick to evaluate this functional integral is to write it as a product of exponential factors, schematically,
$$\tilde{Z}[\tilde{J}] \sim \int\mathcal{D}\tilde\phi\,\prod_p\left[e^{-(p^2+m^2)\tilde\phi^2/2}\,e^{-g\tilde\phi^4/4!}\,e^{\tilde{J}\tilde\phi}\right].$$
The second two exponential factors can be expanded as power series, and the combinatorics of this expansion can be
represented graphically. The integral with $g = 0$ can be treated as a product of infinitely many elementary Gaussian
integrals, and the result may be expressed as a sum of Feynman diagrams, calculated using the following Feynman
rules:
• Each field in the n-point Euclidean Green's function is represented by an external line (half-edge) in the
graph, and associated with momentum $p$.
• Each vertex is represented by a factor $-g$.
• At a given order $g^k$, all diagrams with $n$ external lines and $k$ vertices are constructed such that the total momentum
flowing into each vertex is zero. Each internal line is represented by a propagator $1/(q^2 + m^2)$, where $q$ is the
momentum flowing through that line.
• Any unconstrained momenta are integrated over all values.
• The result is divided by a symmetry factor, which is the number of ways the lines and vertices of the graph can be
rearranged without changing its connectivity.
• Do not include graphs containing "vacuum bubbles", connected subgraphs with no external lines.
The last rule takes into account the effect of dividing by $Z[0]$. The Minkowski-space Feynman rules are similar,
except that each vertex is represented by $-ig$, while each internal line is represented by a propagator $i/(q^2 - m^2 + i\epsilon)$,
where the $\epsilon$ term represents the small Wick rotation needed to make the Minkowski-space Gaussian integral
converge.
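The idea of expanding the quartic exponential and evaluating Gaussian integrals term by term can be illustrated in a zero-dimensional toy model, where the path integral collapses to an ordinary integral over a single variable. This toy model is an assumption made for illustration and is not discussed in the text: $Z(g) = \int \frac{d\phi}{\sqrt{2\pi}}\, e^{-\phi^2/2 - g\phi^4/4!}$. The Gaussian moments $\langle\phi^{2k}\rangle = (2k-1)!!$ count the Wick pairings, and all function names are hypothetical.

```python
import math


def double_factorial_odd(k):
    """(2k - 1)!!, the number of complete pairings of 2k fields."""
    result = 1
    for odd in range(1, 2 * k, 2):
        result *= odd
    return result


def z_perturbative(g, order):
    """Series for Z(g): expand exp(-g phi^4 / 4!) and use Gaussian moments."""
    return sum(
        (-g / 24.0) ** n / math.factorial(n) * double_factorial_odd(2 * n)
        for n in range(order + 1)
    )


def z_numeric(g, half_width=10.0, steps=20000):
    """Trapezoidal evaluation of the same integral, for comparison."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        phi = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.exp(-0.5 * phi**2 - g / 24.0 * phi**4)
    return total * h / math.sqrt(2.0 * math.pi)
```

The first-order coefficient is $-3g/4! = -g/8$; the 8 is the symmetry factor of the figure-eight vacuum diagram, the simplest example of the symmetry factors mentioned in the rules above. The series is asymptotic, so it should only be compared with the numerical value at small $g$ and low order.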
Renormalization
The integrals over unconstrained momenta, called "loop integrals", in the Feynman graphs typically diverge. This is
normally handled by renormalization, which is a procedure of adding divergent counter-terms to the Lagrangian in
such a way that the diagrams constructed from the original Lagrangian and counter-terms are finite.[2] A
renormalization scale must be introduced in the process, and the coupling constant and mass become dependent upon
it.
The dependence of a coupling constant $g$ on the scale $\lambda$ is encoded by a beta function, $\beta(g)$, defined by the relation
$$\beta(g) = \lambda\frac{\partial g}{\partial\lambda}.$$
This dependence on the energy scale is known as the running of the coupling parameter, and the theory of this kind of
scale-dependence in quantum field theory is described by the renormalization group.
Beta-functions are usually computed in an approximation scheme, most commonly perturbation theory, where one
assumes that the coupling constant is small. One can then make an expansion in powers of the coupling parameters
and truncate the higher-order terms (also known as higher loop contributions, due to the number of loops in the
corresponding Feynman graphs).
The beta-function at one loop (the first perturbative contribution) for the $\phi^4$ theory is
$$\beta(g) = \frac{3}{16\pi^2}g^2 + O(g^3).$$
The fact that the sign in front of the lowest-order term is positive suggests that the coupling constant increases with
energy. If this behavior persists at large couplings, this would indicate the presence of a Landau pole at finite energy,
or quantum triviality. The question can only be answered non-perturbatively, since it involves strong coupling.
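The growth of the coupling and the position of the Landau pole can be sketched numerically. The code assumes the standard one-loop result $\beta(g) = 3g^2/(16\pi^2)$ for a $g\phi^4/4!$ interaction and the definition $\beta(g) = \lambda\,\partial g/\partial\lambda$, so that $g(\lambda) = g_0 / [1 - \frac{3 g_0}{16\pi^2}\ln(\lambda/\lambda_0)]$. The function names are hypothetical, and the one-loop formula is not trustworthy once $g$ becomes large.

```python
import math

B = 3.0 / (16.0 * math.pi**2)   # one-loop coefficient, beta(g) = B * g^2


def beta(g):
    return B * g**2


def running_coupling(g0, log_ratio):
    """Closed-form one-loop solution; log_ratio = ln(lambda / lambda_0)."""
    return g0 / (1.0 - B * g0 * log_ratio)


def landau_pole_log(g0):
    """Value of ln(lambda / lambda_0) at which the one-loop coupling diverges."""
    return 1.0 / (B * g0)


def integrate_rk4(g0, log_ratio, steps=1000):
    """Integrate dg/dt = beta(g), t = ln(lambda), with classical Runge-Kutta."""
    h = log_ratio / steps
    g = g0
    for _ in range(steps):
        k1 = beta(g)
        k2 = beta(g + 0.5 * h * k1)
        k3 = beta(g + 0.5 * h * k2)
        k4 = beta(g + h * k3)
        g += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return g
```

For $g_0 = 1$ the one-loop pole sits at $\ln(\lambda/\lambda_0) = 16\pi^2/3 \approx 52.6$, an enormous ratio of scales, which is why the question only becomes pressing at strong coupling.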
A quantum field theory is trivial when the running coupling, computed through its beta function, goes to zero when
the cutoff is removed. Consequently, the propagator becomes that of a free particle and the field is no longer
interacting. Alternatively, the field theory may be interpreted as an effective theory, in which the cutoff is not
removed, giving finite interactions but leading to a Landau pole at some energy scale. For a $\phi^4$ interaction, Michael
Aizenman proved that the theory is indeed trivial for space-time dimension $D \ge 5$.[3] For $D = 4$ the triviality
has yet to be proven rigorously, but lattice computations support it. (See Landau pole for details and
references.) This fact is relevant to the Higgs field, for which triviality bounds are used to set limits on the Higgs
mass, based on the assumption that new physics must enter at a higher scale (perhaps the Planck scale) to prevent the
Landau pole from being reached.
References
[1] A general reference for this section is Ramond, Pierre (2001-12-21). Field Theory: A Modern Primer (Second Edition). USA: Westview
Press. ISBN 0201304503.
[2] See the previous reference, or for more detail, Itzykson, Claude; Zuber, Jean-Bernard (2006-02-24). Quantum Field Theory. Dover.
ISBN 0070320713.
[3] Aizenman, M. (1981). "Proof of the Triviality of $\phi^4_d$ Field Theory and Some Mean-Field Features of Ising Models for d>4". Physical
Review Letters 47: 1-4. doi:10.1103/PhysRevLett.47.1.
Further reading
• Peskin, M. and Schroeder, D.; An Introduction to Quantum Field Theory, Westview Press (1995)
• Weinberg, Steven; The Quantum Theory of Fields (3 volumes), Cambridge University Press (1995)
• Srednicki, Mark; Quantum Field Theory, Cambridge University Press (2007)
• Zinn-Justin, Jean; Quantum Field Theory and Critical Phenomena, Oxford University Press (2002)
External links
• Pedagogic Aides to Quantum Field Theory (http://www.quantumfieldtheory.info) Click on the link for Chap. 3
to find an extensive, simplified introduction to scalars in relativistic quantum mechanics and quantum field
theory.
• 't Hooft, G., "The Conceptual Basis of Quantum Field Theory" (online version (http://www.phys.uu.nl/~thooft/lectures/basisqft.pdf)).
Yang-Mills theory
Yang-Mills theory is a gauge theory based on the SU(N) group. Wolfgang Pauli formulated in 1953 the first
consistent generalization of the five-dimensional theory of Kaluza, Klein, Fock and others to a higher dimensional
internal space.[1] Because Pauli saw no way to give masses to the gauge bosons, he refrained from publishing his
results formally.[1][2]
Although Pauli did not publish this theory, he gave talks widely attended by physicists of the time. In early 1954,
Yang and Mills developed a modern formulation in an effort to extend the original concept of gauge theory for
abelian groups, e.g. quantum electrodynamics, to nonabelian groups to provide an explanation for strong
interactions. This initial idea was not a success, since the quanta of the Yang-Mills field must be massless in order to
maintain gauge invariance. The massless particles should have long range effects, but these effects are not seen in
experiments. The idea was set aside until 1960, when the concept of particles acquiring mass through symmetry
breaking in massless theories was put forward, initially by Jeffrey Goldstone, Yoichiro Nambu, and Giovanni
Jona-Lasinio.
This prompted a significant restart of Yang-Mills theory studies that proved successful in the formulation of both
electroweak unification and quantum chromodynamics (QCD). The electroweak interaction is described by the
SU(2)×U(1) group, while QCD is an SU(3) gauge theory. The electroweak theory is obtained by combining SU(2)
with U(1), where quantum electrodynamics (QED) is described by a U(1) group, and is replaced in the unified
electroweak theory by a U(1) group representing a weak hypercharge rather than electric charge. The massless
bosons from the SU(2)×U(1) theory mix after spontaneous symmetry breaking to produce the 3 massive weak
bosons, and the photon field. The Standard Model combines the strong interaction with the unified electroweak
interaction (unifying the weak and electromagnetic interaction) through the symmetry group SU(2)×U(1)×SU(3). In
the current epoch the strong interaction is not unified with the electroweak interaction, but from the observed
running of the coupling constants it is believed they all converge to a single value at very high energies.
Phenomenology at lower energies in quantum chromodynamics is not completely understood due to the difficulties
of managing such a theory with a strong coupling. This is the reason confinement has not been theoretically proven,
though it is a consistent experimental observation. Proof that QCD confines at low energy is a mathematical problem
of great relevance, and an award has been proposed by the Clay Mathematics Institute for whoever is able to show
that the Yang-Mills theory has a mass gap.
Mathematical overview
Yang-Mills theories are a special example of gauge theory with a non-abelian symmetry group, given by the
Lagrangian
$$\mathcal{L}_{gf} = -\frac{1}{2}\mathrm{Tr}(F^2) = -\frac{1}{4}F^{\mu\nu a}F^a_{\mu\nu},$$
with the generators of the Lie algebra corresponding to the F-quantities (the curvature or field-strength form)
satisfying
$$[T^a, T^b] = i f^{abc}T^c,$$
and the covariant derivative defined as
$$D_\mu = I\partial_\mu - i g T^a A^a_\mu,$$
where $I$ is the identity for the group generators, $A^a_\mu$ is the vector potential, and $g$ is the coupling constant. In four
dimensions, the coupling constant $g$ is a pure number, and for an SU(N) group one has $a, b, c = 1 \ldots N^2 - 1$.
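For the smallest non-abelian case, SU(2), the commutation relations of the generators can be checked directly. The sketch below assumes the conventional choice $T^a = \sigma^a/2$ (Pauli matrices) with structure constants $f^{abc} = \epsilon^{abc}$, and the normalization $\mathrm{Tr}(T^aT^b) = \frac{1}{2}\delta^{ab}$; neither choice is stated in the text, and all helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(len(a))] for i in range(len(a))]


PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
# Generators T^a = sigma^a / 2; SU(2) has N^2 - 1 = 3 of them.
T = [[[entry / 2 for entry in row] for row in sigma] for sigma in PAULI]


def levi_civita(a, b, c):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) / 2


def max_violation():
    """Largest entry of [T^a, T^b] - i eps^{abc} T^c over all a, b."""
    worst = 0.0
    for a in range(3):
        for b in range(3):
            lhs = commutator(T[a], T[b])
            for i in range(2):
                for j in range(2):
                    rhs = sum(1j * levi_civita(a, b, c) * T[c][i][j]
                              for c in range(3))
                    worst = max(worst, abs(lhs[i][j] - rhs))
    return worst


def trace_product(a, b):
    """Tr(T^a T^b), expected to equal delta^{ab} / 2."""
    prod = matmul(T[a], T[b])
    return prod[0][0] + prod[1][1]
```

With this trace normalization the two forms of the Lagrangian, $-\frac{1}{2}\mathrm{Tr}(F^2)$ and $-\frac{1}{4}F^{\mu\nu a}F^a_{\mu\nu}$, agree.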
The relation
$$F^a_{\mu\nu} = \partial_\mu A^a_\nu - \partial_\nu A^a_\mu + g f^{abc}A^b_\mu A^c_\nu$$
can be derived from the commutator
$$[D_\mu, D_\nu] = -i g T^a F^a_{\mu\nu}.$$
The field has the property of being self-interacting, and the equations of motion that one obtains are said to be semilinear,
as nonlinearities appear both with and without derivatives. This means that one can manage this theory only by
perturbation theory, with small nonlinearities.
Note that the transition between "upper" ("contravariant") and "lower" ("covariant") vector or tensor components is
trivial for the $a$ indices (e.g. $f^{abc} = f_{abc}$), whereas for $\mu$ and $\nu$ it is nontrivial, corresponding e.g. to the usual Lorentz
signature, $\eta_{\mu\nu} = \mathrm{diag}(+,-,-,-)$.
From the given Lagrangian one can derive the equations of motion given by
$$\partial^\mu F^a_{\mu\nu} + g f^{abc}A^{\mu b}F^c_{\mu\nu} = 0.$$
Putting $F_{\mu\nu} = T^a F^a_{\mu\nu}$, these can be rewritten as
$$(D^\mu F_{\mu\nu})^a = 0.$$
A Bianchi identity holds:
$$(D_\mu F_{\nu\kappa})^a + (D_\kappa F_{\mu\nu})^a + (D_\nu F_{\kappa\mu})^a = 0.$$
A source $J^a_\mu$ enters into the equations of motion as
$$\partial^\mu F^a_{\mu\nu} + g f^{abc}A^{\mu b}F^c_{\mu\nu} = -J^a_\nu.$$
Note that the currents must properly change under gauge group transformations.
Quantization of Yang-Mills theory
The most appropriate method to quantize the Yang-Mills theory is by functional methods, i.e. path integrals. One
introduces a generating functional for n-point functions as
$$Z[j] = \int [dA]\,\exp\left[-\frac{i}{2}\int d^4x\,\mathrm{Tr}(F^{\mu\nu}F_{\mu\nu}) + i\int d^4x\,j^a_\mu(x)A^{a\mu}(x)\right],$$
but this integral has no meaning as it stands, because the vector potential can be arbitrarily chosen due to the gauge freedom.
This problem was already known for quantum electrodynamics but here becomes more severe due to the non-abelian
properties of the gauge group. A way out has been given by Ludvig Faddeev and Victor Popov with the introduction
of a ghost field (see Faddeev-Popov ghost) that has the property of being unphysical since, although it obeys
Fermi-Dirac statistics, it is a complex scalar field, which violates the spin-statistics theorem. So, we can write
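the generating functional as
$$Z[j,\bar\varepsilon,\varepsilon] = \int [dA][d\bar{c}][dc]\,e^{iS_F[\partial A, A] + iS_{gf}[\partial A] + iS_g[\partial c, \partial\bar{c}, c, \bar{c}, A]}\,e^{i\int d^4x\,j^a_\mu(x)A^{a\mu}(x) + i\int d^4x\,[\bar{c}^a(x)\varepsilon^a(x) + \bar\varepsilon^a(x)c^a(x)]},$$
with the field action $S_F = -\frac{1}{4}\int d^4x\,F^a_{\mu\nu}F^{a\mu\nu}$, the gauge-fixing term $S_{gf} = -\frac{1}{2\xi}\int d^4x\,(\partial\cdot A)^2$, and the ghost
action $S_g = -\int d^4x\,(\bar{c}^a\partial_\mu\partial^\mu c^a + g\,\bar{c}^a f^{abc}\partial_\mu A^{b\mu}c^c)$.
This is the expression commonly used to derive Feynman's rules (see Feynman diagram). Here $c^a$ is the
ghost field, while $\xi$ fixes the gauge's choice for the quantization. Feynman's rules obtained from this functional are
the following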
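gluon propagator: $D^{ab}_{\mu\nu}(p) = \frac{-i\delta^{ab}}{p^2 + i0}\,(\ldots)$
3-gluon vertex: $\Gamma^{abc}_{\mu\nu\lambda}(p,q,r) = \ldots\,[(p - q)_\lambda\eta_{\mu\nu} + (q - r)_\mu\eta_{\nu\lambda} + (r - p)_\nu\eta_{\lambda\mu}]$
4-gluon vertex: $\Gamma^{abcd}_{\mu\nu\kappa\lambda} = \ldots$
ghost propagator: $C^{ab}(p) = \frac{\ldots}{p^2 + i0}$
ghost-gluon vertex: $\Gamma^{abc}_\mu(p) = g f^{abc}p_\mu$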
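These rules for Feynman diagrams are easily obtained when we realize that the generating functional given above
can be rewritten as
$$Z[j,\bar\varepsilon,\varepsilon] = e^{-ig\int d^4x\,\frac{\delta}{i\delta\bar\varepsilon^a(x)}f^{abc}\partial_\mu\frac{i\delta}{\delta j^b_\mu(x)}\frac{i\delta}{\delta\varepsilon^c(x)}}\;e^{-ig\int d^4x\,f^{abc}\partial_\mu\frac{i\delta}{\delta j^a_\nu(x)}\frac{i\delta}{\delta j^b_\mu(x)}\frac{i\delta}{\delta j^{c\nu}(x)}}\;e^{-i\frac{g^2}{4}\int d^4x\,f^{abc}f^{ars}\frac{i\delta}{\delta j^b_\mu(x)}\frac{i\delta}{\delta j^c_\nu(x)}\frac{i\delta}{\delta j^{r\mu}(x)}\frac{i\delta}{\delta j^{s\nu}(x)}}\;Z_0[j,\bar\varepsilon,\varepsilon],$$
with
$$Z_0[j,\bar\varepsilon,\varepsilon] = e^{-\int d^4x\,d^4y\,\bar\varepsilon^a(x)C^{ab}(x-y)\varepsilon^b(y)}\;e^{\frac{1}{2}\int d^4x\,d^4y\,j^a_\mu(x)D^{ab\mu\nu}(x-y)j^b_\nu(y)}$$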
the generating functional of the free theory. Expanding in $g$ and computing the functional derivatives, we are able to
obtain all the n-point functions with perturbation theory. Using the LSZ reduction formula, we get from the n-point
functions the corresponding amplitudes for the given processes, from which cross sections and decay rates are
obtained. The theory is renormalizable and corrections are finite at any order of perturbation theory.
For quantum electrodynamics, where the gauge group is abelian (see abelian group), the ghost field
decouples. This can be easily seen by looking at the coupling between the gauge field and the ghost field, which
is $\bar{c}^a f^{abc}\partial_\mu A^{b\mu}c^c$. For the abelian case all the structure constants $f^{abc}$ are zero, and so there is no coupling. In
the non-abelian case, the ghost field appears as a useful way to rewrite the quantum field theory without physical
consequences on the observables of the theory, such as cross sections or decay rates.
One of the most important results obtained for Yang-Mills theory is asymptotic freedom. This result can be obtained
by assuming that the coupling constant $g$ is small (so that the nonlinearities are small), as indeed happens at high
energies, and applying perturbation theory. The result is relevant because a Yang-Mills theory describes the strong
interactions, and asymptotic freedom permits a proper treatment of experimental results coming from deep inelastic
scattering.
In order to obtain the behavior of the Yang-Mills theory at high energies, and so to prove asymptotic freedom, one
does perturbation theory assuming a small coupling. This is verified a posteriori in the ultraviolet limit. In the
opposite, infrared limit, the coupling is too large for perturbation theory to
be reliable. Most of the difficulties that current research meets come from managing the theory at low energies,
which is the interesting regime, being inherent to the description of hadronic matter and, more generally, to all the observed
bound states of gluons and quarks and their confinement (see hadrons). The most used method to study the
theory in this limit is to try to solve it on computers (see lattice gauge theory). In this case, large computational
resources are needed to be sure that the correct limit of infinite volume (smaller lattice spacing) is reached. This is the limit the
results have to be compared with. Smaller spacing and larger coupling are not independent of each other, and
larger computational resources are needed for each. As of today, the situation appears somewhat
satisfactory for the hadronic spectrum and the computation of the gluon and ghost propagators, but the glueball and
hybrid spectra are still in question, also in view of the experimental observation of such exotic states. Indeed,
the $\sigma$ resonance[4][5] is not seen in any of these lattice computations, and contrasting interpretations have been put
forward. This is currently a hotly debated issue.
Beta function
One of the key properties of a quantum field theory is the behavior of the running coupling over the whole energy range.
This behavior can be obtained once the beta function of the theory is known. Our ability to extract results from
a quantum field theory relies on perturbation theory. Once the beta function is known, the behavior of the running
coupling at all energy scales is obtained through the equation
$$\mu^2\frac{d\alpha_s}{d\mu^2} = \beta(\alpha_s),$$
where $\alpha_s = g^2/4\pi$. Yang-Mills theory has the property of being asymptotically free in the large energy limit
(ultraviolet limit). This means that, in this limit, the beta function has a minus sign, driving the running
coupling toward ever smaller values as the energy increases. Perturbation theory permits the evaluation of the beta function in
this limit, producing the following result for SU(N):
$$\beta(\alpha_s) = -\frac{11N}{12\pi}\alpha_s^2 - \frac{17N^2}{24\pi^2}\alpha_s^3 + O(\alpha_s^4).$$
In the opposite limit of low energies (infrared limit), the beta function is not known. It is, however, known exactly for a
supersymmetric Yang-Mills theory. This has been obtained by Novikov, Shifman, Vainshtein and Zakharov[6] and
can be written as
$$\beta(\alpha_s) = -\frac{\alpha_s^2}{4\pi}\,\frac{3N}{1 - \frac{\alpha_s N}{2\pi}}.$$
With this starting point, Thomas Ryttov and Francesco Sannino were able to postulate a non-supersymmetric version
of it,[7] writing down
$$\beta(\alpha_s) = -\frac{11N}{12\pi}\,\frac{\alpha_s^2}{1 - \frac{17N}{22\pi}\alpha_s}.$$
As can be seen from the beta function of the supersymmetric theory, the limit of a large coupling (infrared limit)
implies
$$\beta(\alpha_s) \approx \frac{3}{2}\alpha_s$$
and so the running coupling in the deep infrared limit goes to zero, making this theory trivial. This implies that the
coupling reaches a maximum at some value of the energy, turning again to zero as the energy is lowered. Then, if the
Ryttov and Sannino hypothesis is correct, the same should be true for ordinary Yang-Mills theory. This would be in
agreement with recent lattice computations.[8]
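The relations between the beta functions discussed in this section can be checked numerically. The sketch below assumes the convention $\mu^2\, d\alpha_s/d\mu^2 = \beta(\alpha_s)$ and the standard forms of the two-loop SU(N) result, the Novikov-Shifman-Vainshtein-Zakharov formula for the supersymmetric theory, and the Ryttov-Sannino proposal; these formulas are written from the general literature rather than taken verbatim from the text, and the function names are hypothetical.

```python
import math


def beta_two_loop(alpha, n):
    """Perturbative SU(N) beta function up to order alpha^3."""
    return (-11.0 * n / (12.0 * math.pi) * alpha**2
            - 17.0 * n**2 / (24.0 * math.pi**2) * alpha**3)


def beta_nsvz(alpha, n):
    """Exact beta function of supersymmetric Yang-Mills theory (NSVZ form)."""
    return -(alpha**2 / (4.0 * math.pi)) * 3.0 * n / (1.0 - n * alpha / (2.0 * math.pi))


def beta_ryttov_sannino(alpha, n):
    """Conjectured all-orders form for non-supersymmetric Yang-Mills theory."""
    return (-(11.0 * n / (12.0 * math.pi)) * alpha**2
            / (1.0 - 17.0 * n * alpha / (22.0 * math.pi)))


def alpha_one_loop(alpha0, n, log_mu2_ratio):
    """One-loop running coupling; log_mu2_ratio = ln(mu^2 / mu_0^2)."""
    return alpha0 / (1.0 + 11.0 * n / (12.0 * math.pi) * alpha0 * log_mu2_ratio)
```

Expanding the Ryttov-Sannino form in powers of $\alpha_s$ reproduces both perturbative coefficients, since $\frac{11N}{12\pi}\cdot\frac{17N}{22\pi} = \frac{17N^2}{24\pi^2}$, and the supersymmetric formula approaches $\frac{3}{2}\alpha_s$ at large coupling.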
Open problems
The Yang-Mills theories were generally acknowledged in the physics community after Gerard 't Hooft, in 1972,
proved their renormalizability. This applies even if the gauge bosons described by this theory are massive, as in
the electroweak theory. However, the mass is only an "acquired" one, generated by the Higgs
mechanism.
Concerning the mathematics, it should be noted that presently, i.e. in 2009, the Yang-Mills theory is a very active
field of research, yielding e.g. a classification of differentiable structures of four-dimensional manifolds by Simon
Donaldson. Furthermore, the field of Yang-Mills theories was included in the Clay Mathematics Institute's list of
"Millennium Prize Problems". Here the prize problem consists, in particular, in a proof of the conjecture that the
lowest excitations of a pure Yang-Mills theory (i.e. without matter fields) have a finite mass gap with regard to the
vacuum state. Another open problem, connected with this conjecture, is a proof of the confinement property in the
presence of additional fermions.
In physics the survey of Yang-Mills theories does not usually start from perturbation analysis or analytical methods,
but more recently from systematic application of numerical methods to lattice gauge theories.
References
[1] Straumann, N: "On Pauli's invention of non-abelian Kaluza-Klein Theory in 1953", eprint arXiv:gr-qc/0012054
[2] See Abraham Pais' account of this period as well as L. Susskind's "Superstrings, Physics World on the first non-abelian gauge theory" where
Susskind wrote that Yang-Mills was "rediscovered" only because Pauli had chosen not to publish
[3] Yang, C. N.; Mills, R. (1954), "Conservation of Isotopic Spin and Isotopic Gauge Invariance", Physical Review 96 (1): 191-195,
doi:10.1103/PhysRev.96.191
[4] Caprini, I.; Colangelo, G.; Leutwyler, H. (2006), "Mass and width of the lowest resonance in QCD", Physical Review Letters 96 (13): 132001,
doi:10.1103/PhysRevLett.96.132001
[5] Yndurain, F. J.; Garcia-Martin, R.; Pelaez, J. R. (2007), "Experimental status of the ππ isoscalar S wave at low energy: f0(600) pole
and scattering length", Physical Review D 76 (7): 074034, doi:10.1103/PhysRevD.76.074034
[6] Novikov, V. A.; Shifman, M. A.; Vainshtein, A. I.; Zakharov, V. I. (1983), "Exact Gell-Mann-Low Function Of Supersymmetric
Yang-Mills Theories From Instanton Calculus", Nuclear Physics B 229 (2): 381-393, doi:10.1016/0550-3213(83)90338-3
[7] Ryttov, T.; Sannino, F. (2008), "Supersymmetry Inspired QCD Beta Function", Physical Review D 78 (6): 065001,
doi:10.1103/PhysRevD.78.065001
[8] Bogolubsky, I. L.; Ilgenfritz, E.-M.; Müller-Preussker, M.; Sternbeck, A. (2009), "Lattice gluodynamics computation of Landau-gauge
Green's functions in the deep infrared", Physics Letters B 676 (1-3): 69-73, doi:10.1016/j.physletb.2009.04.076
Further reading
Books
• Frampton, P. (2008). Gauge Field Theories (3rd ed.). Wiley-VCH. ISBN 978-3527408351.
• Cheng, T.-P.; Li, L.-F. (1983). Gauge Theory of Elementary Particle Physics. Oxford University Press.
ISBN 0-19-851961-3.
• 't Hooft, Gerardus (2005). 50 Years of Yang-Mills theory. World Scientific. ISBN 981-238-934-2.
Articles
• Svetlichny, George (1999). "Preparation for Gauge Theory" (http://arxiv.org/abs/math-ph/9902027).
• Gross, D. (1992). "Gauge theory - Past, Present and Future" (http://psroc.phys.ntu.edu.tw/cjp/v30/955.pdf).
Retrieved 2009-04-23.
External links
• Yang-Mills theory on Dispersive Wiki (http://tosio.math.toronto.edu/wiki/index.php/Yang-Mills_equations)
• The Clay Mathematics Institute (http://www.claymath.org)
• The Millennium Prize Problems (http://www.claymath.org/prizeproblems)
Yangian
The Yangian is an important structure in modern representation theory, a type of quantum group with origins in
physics. Yangians first appeared in the work of Ludvig Faddeev and his school concerning the quantum inverse
scattering method in the late 1970s and early 1980s. Initially they were considered a convenient tool to generate the
solutions of the quantum Yang-Baxter equation. The name Yangian was introduced by Vladimir Drinfeld in 1985 in
honor of C.N. Yang.
Description
For any finite-dimensional semisimple Lie algebra $a$, Drinfeld defined an infinite-dimensional Hopf algebra $Y(a)$,
called the Yangian of $a$. This Hopf algebra is a deformation of the universal enveloping algebra $U(a[z])$ of the Lie
algebra of polynomial loops of $a$ given by explicit generators and relations. The relations can be encoded by
identities involving a rational R-matrix. Replacing it with a trigonometric R-matrix, one arrives at affine quantum
groups, defined in the same paper of Drinfeld.
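In the case of the general linear Lie algebra $gl_N$, the Yangian admits a simpler description in terms of a single ternary
(or RTT) relation on the matrix generators due to Faddeev and coauthors. The Yangian $Y(gl_N)$ is defined to be the
algebra generated by elements $t_{ij}^{(p)}$ with $1 \le i, j \le N$ and $p \ge 0$, subject to the relations
$$[t_{ij}^{(p+1)}, t_{kl}^{(q)}] - [t_{ij}^{(p)}, t_{kl}^{(q+1)}] = -\left(t_{kj}^{(p)}t_{il}^{(q)} - t_{kj}^{(q)}t_{il}^{(p)}\right).$$
Defining $t_{ij}^{(-1)} = \delta_{ij}$, setting
$$T(z) = \left(\sum_{p \ge -1} t_{ij}^{(p)}z^{-p-1}\right)_{i,j=1}^{N}$$
and introducing the R-matrix $R(z) = I + z^{-1}P$ on $\mathbb{C}^N \otimes \mathbb{C}^N$, where $P$ is the operator permuting the tensor factors, the
above relations can be written more simply as the ternary relation:
$$R_{23}(z - w)\,T_{12}(z)\,T_{13}(w) = T_{13}(w)\,T_{12}(z)\,R_{23}(z - w).$$
The Yangian becomes a Hopf algebra with comultiplication $\Delta$, counit $\varepsilon$ and antipode $s$ given by
$$(\Delta \otimes \mathrm{id})T(z) = T_{12}(z)T_{13}(z), \qquad (\varepsilon \otimes \mathrm{id})T(z) = I, \qquad (s \otimes \mathrm{id})T(z) = T(z)^{-1}.$$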
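The rational R-matrix $R(z) = I + z^{-1}P$ is a solution of the quantum Yang-Baxter equation mentioned in the introduction. A small exact check for $N = 2$ is sketched below. The form of the equation used, $R_{12}(u-v)R_{13}(u)R_{23}(v) = R_{23}(v)R_{13}(u)R_{12}(u-v)$ on a three-fold tensor product, is the usual one and is an assumption here, as are all helper names.

```python
from fractions import Fraction
from itertools import product

N = 2                                          # dimension of each tensor factor
BASIS = list(product(range(N), repeat=3))      # basis of the three-fold tensor product
INDEX = {state: n for n, state in enumerate(BASIS)}
DIM = len(BASIS)


def identity():
    return [[Fraction(int(i == j)) for j in range(DIM)] for i in range(DIM)]


def permutation(a, b):
    """Matrix of the operator swapping tensor factors a and b (0-based)."""
    mat = [[Fraction(0)] * DIM for _ in range(DIM)]
    for state in BASIS:
        swapped = list(state)
        swapped[a], swapped[b] = swapped[b], swapped[a]
        mat[INDEX[tuple(swapped)]][INDEX[state]] = Fraction(1)
    return mat


def r_matrix(a, b, z):
    """R_ab(z) = I + P_ab / z acting on the three-fold tensor product."""
    p, one = permutation(a, b), identity()
    return [[one[i][j] + p[i][j] / z for j in range(DIM)] for i in range(DIM)]


def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(DIM)) for j in range(DIM)]
            for i in range(DIM)]


def yang_baxter_holds(u, v):
    """Exact test of the Yang-Baxter equation; needs u, v and u - v nonzero."""
    u, v = Fraction(u), Fraction(v)
    lhs = matmul(matmul(r_matrix(0, 1, u - v), r_matrix(0, 2, u)), r_matrix(1, 2, v))
    rhs = matmul(matmul(r_matrix(1, 2, v), r_matrix(0, 2, u)), r_matrix(0, 1, u - v))
    return lhs == rhs
```

Exact rational arithmetic is used so that the two sides can be compared with `==` rather than with a floating-point tolerance.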
At special values of the spectral parameter $(z - w)$, the R-matrix degenerates to a rank one projection. This can be
used to define the quantum determinant of $T(z)$, which generates the center of the Yangian.
The twisted Yangian $Y^-(gl_{2N})$, introduced by G. I. Olshansky, is the coideal subalgebra generated by the coefficients
of
$$S(z) = T(z)\,\sigma(T(-z)),$$
where $\sigma$ is the involution of $gl_{2N}$ given by
$$\sigma(E_{ij}) = (-1)^{i+j}E_{2N-j+1,\,2N-i+1}.$$
Applications to classical representation theory
G. I. Olshansky and I. Cherednik discovered that the Yangian of $gl_N$ is closely related with the branching properties of
irreducible finite-dimensional representations of general linear algebras. In particular, the classical Gelfand-Tsetlin
construction of a basis in the space of such a representation has a natural interpretation in the language of Yangians,
studied by M. Nazarov and V. Tarasov. Olshansky, Nazarov and Molev later discovered a generalization of this
theory to other classical Lie algebras, based on the twisted Yangian.
Applications to physics
The Yangian appears as a symmetry group in different models in physics. The most famous one is supersymmetric
Yang-Mills theory in four dimensions. The Yangian also appears as a symmetry group of one dimensional exactly solvable
models such as spin chains and the Hubbard model,[1] and in models of one dimensional relativistic quantum field theory.
Representation theory of Yangians
Irreducible finite-dimensional representations of Yangians were parametrized by Drinfeld in a way similar to the
highest weight theory in the representation theory of semisimple Lie algebras. The role of the highest weight is
played by a finite set of Drinfeld polynomials. Drinfeld also discovered a generalization of the classical Schur-Weyl
duality between representations of general linear and symmetric groups that involves the Yangian of $\mathfrak{sl}_N$ and the
degenerate affine Hecke algebra (graded Hecke algebra of type A, in George Lusztig's terminology).
Representations of Yangians have been extensively studied, but the theory is still under active development.
References
• Chari, Vyjayanthi; Andrew Pressley (1994). A Guide to Quantum Groups. Cambridge, U.K.: Cambridge
University Press. ISBN 0-521-55884-0.
• Drinfel'd, Vladimir Gershonovich (1985). "Алгебры Хопфа и квантовое уравнение Янга-Бакстера [Hopf algebras and the quantum Yang-Baxter equation]" (in Russian). Doklady Akademii Nauk SSSR 283 (5): 1060-1064.
• Drinfel'd, V. G. (1987). "[A new realization of Yangians and of quantum affine algebras]" (in Russian). Doklady Akademii Nauk SSSR 296 (1): 13-17. Translated in Soviet Mathematics - Doklady 36 (2): 212-216. 1988.
• Drinfel'd, V. G. (1986). "Вырожденные аффинные алгебры Гекке и янгианы [Degenerate affine Hecke algebras and Yangians]" (in Russian). Funktsional'nyi Analiz i Ego Prilozheniya 20 (1): 69-70. MR831053, Zbl: 0599.20049. Translated in Drinfel'd, V. G. (1986). "Degenerate affine Hecke algebras and Yangians". Functional Analysis and Its Applications 20 (1): 58-60. doi:10.1007/BF01077318.
• Molev, Alexander Ivanovich (2007). Yangians and Classical Lie Algebras. Mathematical Surveys and
Monographs. Providence, RI: American Mathematical Society. ISBN 978-0-8218-4374-1.
References
[1] http://arxiv.org/pdf/hep-th/9310158
[2] http://www.mathnet.ru/php/getFT.phtml?jrnid=faa&paperid=1254&volume=20&year=1986
option_lang=eng
Quantum spacetime
In mathematical physics, quantum spacetime is the proposal that the actual spacetime that we live in is more accurately described not by the usual local coordinates $x, y, z, t$ but by operator or algebra variables where the order of at least some of the products matters. The idea is borrowed from the canonical commutation relations in quantum mechanics, where position and momentum variables $x, p$ are mutually noncommutative, but is postulated now for relations between one or more of the spacetime variables themselves. Just as with Heisenberg's uncertainty principle, a quantum spacetime necessarily comes with uncertainty relations in the sense that not all of $x, y, z, t$ can simultaneously be ascribed actual numerical values.
There are fundamental physical reasons to believe that spacetime is better modeled in this way. Because of wave-particle duality, the energy needed to probe smaller and smaller distances would be greater and greater, until the test particles doing the probing would form black holes and destroy the very geometry one is trying to measure. This means that our usual picture of continuum spacetime must itself break down as we approach such Planck scale distances.
Quantum spacetime is a particular attempt to address this using ideas from quantum mechanics and is plausibly
expected on the grounds that the corrections to geometry are being induced by quantum gravity.
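The black-hole argument above can be made quantitative with a rough estimate. The sketch below assumes the standard heuristic of equating the reduced Compton wavelength $\hbar/(mc)$ of a probe with its Schwarzschild radius $2Gm/c^2$; the constants are rounded CODATA values and the function names are illustrative only.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J s
G = 6.67430e-11         # Newton's constant, m^3 kg^-1 s^-2
C = 2.99792458e8        # speed of light, m/s

def planck_length():
    """Planck length sqrt(hbar G / c^3) in metres."""
    return math.sqrt(HBAR * G / C**3)

def crossover_mass():
    """Mass (kg) at which hbar/(m c) equals the Schwarzschild radius 2 G m / c^2."""
    return math.sqrt(HBAR * C / (2 * G))

m = crossover_mass()
compton = HBAR / (m * C)           # distance the probe can resolve
schwarzschild = 2 * G * m / C**2   # horizon radius of the same probe
print(planck_length(), compton, schwarzschild)
```

The two lengths coincide at $\sqrt{2}$ times the Planck length, about $10^{-35}$ m, which is the scale at which the continuum picture is expected to fail.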
The mathematics allowing such a physical possibility has been developed under the general headings of
noncommutative geometry and quantum geometry. In practice the better-known Connes approach to noncommutative geometry has not been applied so much here, and the currently best-known models of quantum spacetime fall within a more pedestrian quantum groups approach to noncommutative geometry, based on symmetry.
It is important to insist that not every noncommutative algebra with four generators qualifies properly to be called a quantum spacetime. One should require at least the following:
• There should be a plausible expectation that such an algebra might actually arise in an effective description of quantum gravity effects in some regime of that theory. In this context there should be a parameter $\lambda$, say, controlling the extent of deviation from ordinary spacetime and ultimately identifiable with physical constants such as the Planck length. One should obtain ordinary Lorentzian spacetime as $\lambda \to 0$.
• Local Lorentz group and Poincaré group symmetries should be retained in some sufficient but possibly generalised form. These symmetries of ordinary spacetime are needed for the formulation of Special Relativity, and their generalisation often takes the form of a quantum group acting on the quantum spacetime algebra.
• There should be a notion of quantum differential calculus on the quantum spacetime algebra, compatible with the (quantum) symmetry and preferably reducing to ordinary differential calculus as $\lambda \to 0$.
These and some partial understanding of the rest of the story are the minimum needed to have wave equations for
particles and fields and hence first predictions for the deviations from classical spacetime physics. This in turn is
needed if the theory is ever to be tested.
Several models were found in the 1990s meeting the above criteria to lesser or greater extents. The most important of these currently has testable predictions, making the search for quantum spacetime not only a theoretical fancy but potentially an actual new discovery about our physical world.
Bicrossproduct model spacetime
This model was introduced in 1994 by Shahn Majid and Henri Ruegg [1] and has relations
$$[x_i, x_j] = 0, \qquad [x_i, t] = i\lambda x_i$$
for spatial variables $x_i$ and the one time variable $t$. Here $\lambda$ has dimensions of length and is therefore expected to be something like the Planck length. The Poincaré group here is correspondingly deformed, now to a certain bicrossproduct quantum group with the following characteristic features.
The momentum generators $p_i$ commute among themselves, but addition of momenta, reflected in the quantum group structure, is deformed (momentum space becomes a non-abelian group). Meanwhile, the Lorentz group generators enjoy their usual relations among themselves but act non-linearly on the momentum space. The orbits for this action are depicted in the figure as a cross-section of $p_0$ against one of the $p_i$. The on-shell region describing particles in the upper centre of the image would normally be hyperboloids, but these are now 'squashed up' into the cylinder
$$p_1^2 + p_2^2 + p_3^2 < 1$$
in simplified units. The upshot is that as you try to Lorentz-boost the momentum of a particle you will never exceed the Planck momentum. The existence of a highest momentum scale or lowest distance scale fits the physical picture. The existence of such squashing behaviour comes from the non-linearity of the action and is an endemic feature of bicrossproduct quantum groups, known since their introduction in 1988 [2]. Some physicists have been so impressed by this feature of the bicrossproduct model that they have dubbed it doubly special relativity, but such nomenclature remains controversial and disputed.
[Figure: Orbits for the action of the Lorentz group on momentum space in the construction of the bicrossproduct model, in units of $\lambda^{-1}$. Mass-shell hyperboloids are 'squashed' into a cylinder.]
Another consequence of the squashing is that the propagation of particles is deformed, even of light, leading to a variable speed of light prediction. Key to this prediction was a reason to believe that the particular $p_0, p_i$ are plausibly the physical energy and spatial momentum (as opposed to some other function of them), provided in 1999 [3] by Giovanni Amelino-Camelia and Majid through a study of plane waves for a quantum differential calculus in the model. They take the form
$$e^{i \sum_i p_i x_i}\, e^{i p_0 t},$$
in other words a form which is sufficiently close to classical that one might plausibly believe the interpretation. At the moment such wave analysis represents the best hope to obtain physically testable predictions from the model.
Prior to this work there were a number of unsupported claims to make predictions from the model based solely on the form of the Poincaré quantum group. There were also claims based on an earlier $\kappa$-Poincaré quantum group introduced by Jurek Lukierski and co-workers [4], which should be viewed as an important precursor to the bicrossproduct one, albeit without the actual quantum spacetime and with different proposed generators for which the above picture does not apply. The bicrossproduct model spacetime has also been called $\kappa$-deformed spacetime with $\kappa = \lambda^{-1}$.
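As a worked illustration of how a deformed addition of momenta can arise, the following sketch assumes the commutation relations $[x_i, x_j] = 0$, $[x_i, t] = i\lambda x_i$ and ordered plane waves of the form $e^{i p \cdot x} e^{i p_0 t}$ (space to the left of time). Signs and orderings are convention-dependent, so this is an illustration under those assumptions rather than the formulae of the original papers.

```latex
\begin{align*}
x_i\, t &= (t + i\lambda)\, x_i
  &&\Rightarrow\quad f(t)\, x_i = x_i\, f(t - i\lambda), \\
e^{i p_0 t}\, x_i &= e^{\lambda p_0}\, x_i\, e^{i p_0 t}
  &&\Rightarrow\quad e^{i p_0 t}\, e^{i p' \cdot x}
     = e^{i e^{\lambda p_0} p' \cdot x}\, e^{i p_0 t}, \\
\left(e^{i p \cdot x}\, e^{i p_0 t}\right)
\left(e^{i p' \cdot x}\, e^{i p'_0 t}\right)
  &= e^{i \left(p + e^{\lambda p_0} p'\right) \cdot x}\, e^{i (p_0 + p'_0) t}.
\end{align*}
```

Under these assumptions the composition law $(p, p_0)(p', p'_0) = (p + e^{\lambda p_0} p',\ p_0 + p'_0)$ is not symmetric under exchange of the two waves, which is the sense in which momentum space becomes a non-abelian group; ordinary addition is recovered as $\lambda \to 0$.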
q-Deformed spacetime
This model was introduced independently by a team [5] working under Julius Wess in 1990 and by Majid and coworkers in a series of papers on braided matrices starting a year later [6]. The point of view in the second approach is that usual Minkowski spacetime has a nice description via Pauli matrices as the space of 2 x 2 hermitian matrices. In quantum group theory and using braided monoidal category methods one has a natural q-version of this, defined here for real values of $q$ as a 'braided hermitian matrix' of generators and relations
$$\begin{pmatrix}\alpha & \beta\\ \gamma & \delta\end{pmatrix} = \begin{pmatrix}\alpha & \beta\\ \gamma & \delta\end{pmatrix}^\dagger,\quad \beta\alpha = q^2\alpha\beta,\quad [\alpha,\delta] = 0,\quad [\beta,\gamma] = (1-q^{-2})\,\alpha(\delta-\alpha),\quad [\delta,\beta] = (1-q^{-2})\,\alpha\beta.$$
These relations say that the generators commute as $q \to 1$, thereby recovering usual Minkowski space. One can work with more familiar variables $x, y, z, t$ as linear combinations of these. In particular, time
$$t = \mathrm{Trace}_q \begin{pmatrix}\alpha & \beta\\ \gamma & \delta\end{pmatrix} = q\delta + q^{-1}\alpha$$
is given by a natural braided trace of the matrix and commutes with the other generators (so this model has a very different flavour from the bicrossproduct one). The braided-matrix picture also leads naturally to a quantity
$$\det{}_q \begin{pmatrix}\alpha & \beta\\ \gamma & \delta\end{pmatrix} = \alpha\delta - q^2\gamma\beta$$
which as $q \to 1$ returns the usual Minkowski distance (this translates to a metric in the quantum differential geometry). The parameter $q = e^{\lambda}$ or $q = e^{i\lambda}$ is dimensionless, and $\lambda$ is thought to be a ratio of the Planck scale and the cosmological length. That is, there are indications that this model relates to quantum gravity with non-zero cosmological constant, the choice of $q$ depending on whether this is positive or negative. We have described the mathematically better understood but perhaps less physically justified positive case here.
A full understanding of this model requires (and was concurrent with the development of) a full theory of 'braided linear algebra' for such spaces. The momentum space for the theory is another copy of the same algebra and there is a certain 'braided addition' of momentum on it, expressed as the structure of a braided Hopf algebra (or quantum group in a certain braided monoidal category). This theory by 1993 had provided the corresponding q-deformed Poincaré group as generated by such translations and q-Lorentz transformations, completing the interpretation as a quantum spacetime [7].
In the process it was discovered that the Poincaré group not only had to be deformed but had to be extended to
include dilations of the quantum spacetime. For such a theory to be exact we would need all particles in the theory to
be massless, which is consistent with experiment as masses of elementary particles are indeed vanishingly small
compared to the Planck mass. If current thinking in cosmology is correct then this model is more appropriate, but it
is significantly more complicated and for this reason its physical predictions have yet to be worked out.
Fuzzy or spin model spacetime
This refers in modern usage to the angular momentum algebra
$$[x_1, x_2] = 2i\lambda x_3, \quad [x_2, x_3] = 2i\lambda x_1, \quad [x_3, x_1] = 2i\lambda x_2$$
familiar from quantum mechanics but regarded now as coordinates of a quantum space or spacetime. This is primarily of interest as a toy model of quantum gravity where we work only in 3 spacetime dimensions (not the correct 4), and it is presented here with a Euclidean rather than Lorentzian signature. It was first proposed in this context by Gerardus 't Hooft [8], while its full development, including a quantum differential calculus and an action of a certain 'quantum double' quantum group as deformed Euclidean group of motions, was obtained by Majid and E. Batista [9].
A striking feature of the noncommutative geometry here is that the smallest covariant quantum differential calculus
has one dimension higher than expected, namely 4, suggesting that the above can also be viewed as the spatial part
of a 4-dimensional quantum spacetime. The model should not be confused with fuzzy spheres which are
finite-dimensional matrix algebras which one can think of as spheres in the spin model spacetime of fixed radius.
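The simplest finite-dimensional matrix realization can be checked directly. The sketch below assumes the relations $[x_1, x_2] = 2i\lambda x_3$ (and cyclic permutations) and represents the coordinates by rescaled Pauli matrices, $x_i = \lambda\sigma_i$; the value of `LAMBDA` and the helper names are illustrative choices, not part of the model.

```python
LAMBDA = 0.5  # illustrative value of the deformation parameter

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def scale(c, a):
    return [[c * entry for entry in row] for row in a]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def commutator(a, b):
    return add(matmul(a, b), scale(-1, matmul(b, a)))

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

# Pauli matrices sigma_1, sigma_2, sigma_3
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
X = [scale(LAMBDA, s) for s in SIGMA]  # x_i = lambda * sigma_i

# [x_i, x_{i+1}] = 2 i lambda x_{i+2}, indices taken cyclically
relations_hold = all(
    close(commutator(X[i], X[(i + 1) % 3]), scale(2j * LAMBDA, X[(i + 2) % 3]))
    for i in range(3)
)
# x_1^2 + x_2^2 + x_3^2 is a multiple of the identity: a sphere of fixed radius
radius_squared = add(add(matmul(X[0], X[0]), matmul(X[1], X[1])), matmul(X[2], X[2]))
print(relations_hold, radius_squared)
```

In this two-dimensional representation the sum of squares equals $3\lambda^2$ times the identity, which illustrates how a matrix algebra of this kind can be thought of as a sphere of fixed radius.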
Heisenberg model spacetimes
The first concrete model of quantum spacetime is often attributed to Hartland Snyder in 1947; however, his paper [10] actually proposes that
$$[x_\mu, x_\nu] = i M_{\mu\nu},$$
where the $M_{\mu\nu}$ generate and are interpreted as the Lorentz group. This is not an actual quantum spacetime in the sense above because the spacetime coordinates $x_\mu$ do not form a self-contained algebra among themselves; rather, Snyder was proposing a radical unification of spacetime with the Lorentz and Poincaré groups.
The idea was revived in a modern context by Sergio Doplicher, Klaus Fredenhagen and John Roberts in 1995 [11] by letting $M_{\mu\nu}$ simply be viewed as some function of $x_\mu$ as defined by the above relation, and any relations involving it viewed as higher order relations among the $x_\mu$. The Lorentz symmetry is arranged so as to transform the indices as usual and without being deformed.
An even simpler variant of this model is to let $M_{\mu\nu}$ here be a numerical antisymmetric tensor, in which context it is usually denoted $\theta_{\mu\nu}$, so the relations are $[x_\mu, x_\nu] = i\theta_{\mu\nu}$. In even dimensions $D$, any nondegenerate such theta can be transformed to a normal form in which this really is just the Heisenberg algebra, but with the difference that the variables are being proposed as those of spacetime. This proposal was for a time quite popular because of its familiar form of relations and because it has been argued that it emerges from the theory of open strings ending on D-branes; see noncommutative quantum field theory and Moyal plane. However, it should be realised that this D-brane lives in some of the higher spacetime dimensions in the theory, and hence it is not our physical spacetime that string theory suggests to be effectively quantum in this way. One also has to subscribe to D-branes as an approach to quantum gravity in the first place. Even when posited as quantum spacetime it is hard to obtain physical predictions, and one reason for this is that if $\theta_{\mu\nu}$ is a tensor then by dimensional analysis it should have dimensions of (length)$^2$, and if this length is speculated to be the Planck length then the effects would be even harder to ever detect than for other models.
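For the constant-theta case, the relations $[x_\mu, x_\nu] = i\theta_{\mu\nu}$ can be realised on ordinary functions by the Moyal star product. The following is a short sketch under that assumption (constant antisymmetric $\theta$, index placement chosen for convenience), not a statement about any particular paper's conventions.

```latex
\begin{align*}
(f \star g)(x) &= f(x)\, g(x)
  + \frac{i}{2}\, \theta^{\mu\nu}\, \partial_\mu f(x)\, \partial_\nu g(x)
  + O(\theta^2), \\
x^\mu \star x^\nu - x^\nu \star x^\mu
  &= \frac{i}{2}\left(\theta^{\mu\nu} - \theta^{\nu\mu}\right)
   = i\, \theta^{\mu\nu}.
\end{align*}
```

For the coordinate functions themselves the higher-order terms vanish, so the commutator is exact; since each derivative carries a factor of inverse length, $\theta^{\mu\nu}$ must carry dimensions of (length)$^2$, in line with the dimensional argument above.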
Noncommutative extensions to spacetime
Although not quantum spacetime in the sense above, another use of noncommutative geometry is to tack on 'noncommutative extra dimensions' at each point of ordinary spacetime. Instead of invisible curled up extra dimensions as in string theory, Alain Connes and coworkers have argued that the coordinate algebra of this extra part should be replaced by a finite-dimensional noncommutative algebra. For a certain reasonable choice of this algebra, its representation and extended Dirac operator, one is able to recover the Standard Model of elementary particles. In this point of view the different kinds of matter particles are manifestations of geometry in these extra noncommutative directions. Connes' first works here date from 1989 [13], but the approach has been developed considerably since then. Such an approach can theoretically be combined with quantum spacetime as above.
References
[1] Majid, S.; Ruegg, H. (1994), "Bicrossproduct structure of the $\kappa$-Poincaré group and noncommutative geometry", Physics Letters B 334: 348-354, doi:10.1016/0370-2693(94)90699-8
[2] Majid, Shahn (1988), "Hopf algebras for physics at the Planck scale", Classical and Quantum Gravity 5: 1587-1607, doi:10.1088/0264-9381/5/12/010
[3] Amelino-Camelia, G.; Majid, S. (2000), "Waves on noncommutative spacetime and gamma-ray bursts", International J. Mod. Phys. A 15: 4301-4323
[4] Lukierski, J.; Nowicki, A.; Ruegg, H.; Tolstoy, V.N. (1991), "q-Deformation of Poincaré algebras", Physics Letters B 268: 331-338
[5] Carow-Watamura, U.; Schlieker, M.; Scholl, M.; Watamura, S. (1990), "Tensor representation of the quantum group $SL_q(2, C)$ and quantum Minkowski space", Z. Phys. C 48: 159, doi:10.1007/BF01565619
[6] Majid, S. (1991), "Examples of braided groups and braided matrices", J. Math. Phys. 32: 3246-3253, doi:10.1063/1.529485
[7] Majid, S. (1993), "Braided momentum in the q-Poincaré group", J. Math. Phys. 34: 2045-2058, doi:10.1063/1.530154
[8] 't Hooft, G. (1996), "Quantization of point particles in (2 + 1)-dimensional gravity and spacetime discreteness", Classical and Quantum Gravity 13: 1023-1039, doi:10.1088/0264-9381/13/5/018
[9] Batista, E.; Majid, S. (2003), "Noncommutative geometry of angular momentum space U(su_2)", J. Math. Phys. 44: 107-137, doi:10.1063/1.1517395
[10] Snyder, H. (1947), "Quantized space-time", Phys. Rev. 71: 38-41
[11] Doplicher, S.; Fredenhagen, K.; Roberts, J.E. (1995), "The quantum structure of spacetime at the Planck scale and quantum fields", Commun. Math. Phys. 172: 187-220, doi:10.1007/BF02104515
[12] Seiberg, N.; Witten, E. (1999), "String theory and noncommutative geometry", JHEP 9909: 032
[13] Connes, A.; Lott, J. (1989), "Particle models and noncommutative geometry", Nucl. Phys. Proc. Suppl. B 18: 29, doi:10.1016/0920-5632(91)90120-4
Further reading
• Majid, S. (1995), Foundations of Quantum Group Theory, Cambridge University Press
• D. Oriti, ed. (2009), Approaches to Quantum Gravity, Cambridge University Press
• Connes, A.; Marcolli, M. (2007), Noncommutative Geometry, Quantum Fields and Motives, Colloquium
Publications
• Majid, S.; Schroers, B.J. (2009), "q-Deformation and semidualization in 3D quantum gravity", J. Phys. A: Math. Theor. 42: 425402 (40pp), doi:10.1088/1751-8113/42/42/425402
• R. P. Grimaldi, Discrete and Combinatorial Mathematics: An Applied Introduction, 4th Ed. Addison-Wesley 1999.
• J. Matousek, J. Nesetril, Invitation to Discrete Mathematics. Oxford University Press 1998.
• Taylor E. F., John A. Wheeler, Spacetime Physics, publisher W. H. Freeman, 1963.
External links
• Plus Magazine article on quantum geometry (http://plus.maths.org/issue43/features/noncom/index-gifd.html) by Marianne Freiberger
• S. Majid, ed. (2008), On Space and Time, Cambridge University Press (http://www.ewidgetsonline.com/dxreader/Reader.aspx?token=KSvQC0T+8xw7PY+iwDj29A==&rand=1970407385&buyNowLink=http://www.cambridge.org/us/catalogue/AddToBasket.asp?isbn=9780521889261)
Quantum gauge theory
To quantize a gauge theory, such as Yang-Mills theory, Chern-Simons theory or the BF model, one method is to perform a gauge fixing. This is done in the BRST and Batalin-Vilkovisky formulations. Another is to factor out the symmetry by dispensing with vector potentials altogether (they are not physically observable anyway) and to work directly with Wilson loops, Wilson lines contracted with other charged fields at their endpoints, and spin networks.
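The gauge invariance of Wilson loops is easiest to see in a discretized setting. The sketch below is a toy illustration only: it assumes a compact U(1) gauge field on a small periodic two-dimensional lattice, with link variables transforming as $U_\mu(n) \to g(n)\, U_\mu(n)\, g(n+\hat\mu)^*$, and checks that the smallest Wilson loop (the plaquette) is unchanged by a random gauge transformation. All names are hypothetical.

```python
import cmath
import random

L = 4  # the lattice is L x L with periodic boundary conditions (illustrative size)

def random_links(rng):
    """links[(x, y, mu)] is a U(1) phase on the link leaving site (x, y) in direction mu."""
    return {(x, y, mu): cmath.exp(1j * rng.uniform(0, 2 * cmath.pi))
            for x in range(L) for y in range(L) for mu in (0, 1)}

def gauge_transform(links, g):
    """U_mu(n) -> g(n) U_mu(n) conj(g(n + mu)), with g a site-dependent phase."""
    out = {}
    for (x, y, mu), u in links.items():
        nx, ny = ((x + 1) % L, y) if mu == 0 else (x, (y + 1) % L)
        out[(x, y, mu)] = g[(x, y)] * u * g[(nx, ny)].conjugate()
    return out

def plaquette(links, x, y):
    """Smallest Wilson loop: ordered product of links around one lattice square."""
    return (links[(x, y, 0)] * links[((x + 1) % L, y, 1)]
            * links[(x, (y + 1) % L, 0)].conjugate() * links[(x, y, 1)].conjugate())

rng = random.Random(0)
links = random_links(rng)
g = {(x, y): cmath.exp(1j * rng.uniform(0, 2 * cmath.pi))
     for x in range(L) for y in range(L)}
transformed = gauge_transform(links, g)

# The individual links change, but every plaquette stays the same.
max_change = max(abs(plaquette(links, x, y) - plaquette(transformed, x, y))
                 for x in range(L) for y in range(L))
print(max_change)
```

The site phases cancel pairwise around the closed loop, so `max_change` is zero up to floating-point rounding even though the individual link variables are altered.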
Older approaches to quantization for Abelian models use the Gupta-Bleuler formalism with a "semi-Hilbert space"
with an indefinite sesquilinear form. However, it is much more elegant to just work with the quotient space of vector
field configurations by gauge transformations.
An alternative approach using lattice approximations is covered in (Wick rotated) lattice gauge theory.
To establish the existence of the Yang-Mills theory and a mass gap is one of the seven Millennium Prize Problems of
the Clay Mathematics Institute.
Standard Model
The standard model of particle physics is a
theory concerning the electromagnetic,
weak, and strong nuclear interactions, which
mediate the dynamics of the known
subatomic particles. Developed throughout
the early and middle 20th century, the
current formulation was finalized in the mid
1970s upon experimental confirmation of
the existence of quarks. Since then,
discoveries of the bottom quark (1977), the
top quark (1995) and the tau neutrino (2000)
have given credence to the standard model.
Because of its success in explaining a wide
variety of experimental results, the standard
model is sometimes regarded as a theory of
almost everything.
Still, the standard model falls short of being
a complete theory of fundamental
interactions because it does not incorporate
the physics of general relativity, such as
gravitation and dark energy. The theory
does not contain any viable dark matter
particle that possesses all of the required properties deduced from observational cosmology. It also does not correctly
account for neutrino oscillations (and their non-zero masses). Although the standard model is theoretically
self-consistent, it has several unnatural properties giving rise to puzzles like the strong CP problem and the hierarchy
problem.
Nevertheless, the standard model is important to theoretical and experimental particle physicists alike. For theoreticians, the standard model is a paradigm example of a quantum field theory, which exhibits a wide range of physics including spontaneous symmetry breaking, anomalies, non-perturbative behavior, etc. It is used as a basis for building more exotic models which incorporate hypothetical particles, extra dimensions and elaborate symmetries (such as supersymmetry) in an attempt to explain experimental results at variance with the standard model, such as the existence of dark matter and neutrino oscillations. In turn, the experimenters have incorporated the standard model into simulators to help separate signs of new physics beyond the standard model from the relatively uninteresting background.
[Figure: Three generations of matter (fermions). The Standard Model of elementary particles, with the gauge bosons in the rightmost column. Masses shown in the figure:
Quarks: up 2.4 MeV, charm 1.27 GeV, top 171.2 GeV; down 4.8 MeV, strange 104 MeV, bottom 4.2 GeV.
Leptons: electron neutrino < 2.2 eV, muon neutrino < 0.17 MeV, tau neutrino < 15.5 MeV; electron 0.511 MeV, muon 105.7 MeV, tau 1.777 GeV.
Gauge bosons: photon 0, gluon 0, Z 91.2 GeV, W 80.4 GeV.]
Recently, the standard model has found applications in other fields besides particle physics such as astrophysics and
cosmology, in addition to nuclear physics.
Historical background
The first step towards the Standard Model was Sheldon Glashow's discovery, in 1960, of a way to combine the electromagnetic and weak interactions. [1] In 1967, Steven Weinberg [2] and Abdus Salam [3] incorporated the Higgs mechanism [4] [5] [6] into Glashow's electroweak theory, giving it its modern form.
The Higgs mechanism is believed to give rise to the masses of all the elementary particles in the Standard Model.
This includes the masses of the W and Z bosons, and the masses of the fermions - i.e. the quarks and leptons.
After the neutral weak currents caused by Z boson exchange were discovered at CERN in 1973, [7] [8] [9] [10] the
electroweak theory became widely accepted and Glashow, Salam, and Weinberg shared the 1979 Nobel Prize in
Physics for discovering it. The W and Z bosons were discovered experimentally in 1983, and their masses were
found to be as the Standard Model predicted.
The theory of the strong interaction, to which many contributed, acquired its modern form around 1973-74, when
experiments confirmed that the hadrons were composed of fractionally charged quarks.
Overview
At present, matter and energy are best understood in terms of the kinematics and interactions of elementary particles.
To date, physics has reduced the laws governing the behavior and interaction of all known forms of matter and
energy to a small set of fundamental laws and theories. A major goal of physics is to find the "common ground" that
would unite all of these theories into one integrated theory of everything, of which all the other known laws would
be special cases, and from which the behavior of all matter and energy could be derived (at least in principle). [11]
The Standard Model groups two major extant theories — quantum electroweak and quantum chromodynamics — into
an internally consistent theory that describes the interactions between all known particles in terms of quantum field
theory. For a technical description of the fields and their interactions, see Standard Model (mathematical
formulation).
Particle content
Fermions
Organization of Fermions

          Charge   First generation            Second generation            Third generation
Quarks    +2/3     Up ($u$)                    Charm ($c$)                  Top ($t$)
          -1/3     Down ($d$)                  Strange ($s$)                Bottom ($b$)
Leptons   -1       Electron ($e^-$)            Muon ($\mu^-$)               Tau ($\tau^-$)
          0        Electron neutrino ($\nu_e$)  Muon neutrino ($\nu_\mu$)    Tau neutrino ($\nu_\tau$)
The Standard Model includes 12 elementary particles of spin-1/2 known as fermions. According to the spin-statistics
theorem, fermions respect the Pauli exclusion principle. Each fermion has a corresponding antiparticle.
The fermions of the Standard Model are classified according to how they interact (or equivalently, by what charges
they carry). There are six quarks (up, down, charm, strange, top, bottom), and six leptons (electron, electron
neutrino, muon, muon neutrino, tau, tau neutrino). Pairs from each classification are grouped together to form a
generation, with corresponding particles exhibiting similar physical behavior (see table).
The defining property of the quarks is that they carry color charge, and hence, interact via the strong interaction. A
phenomenon called color confinement results in quarks being perpetually (or at least since very soon after the start of
the Big Bang) bound to one another, forming color-neutral composite particles (hadrons) containing either a quark
and an antiquark (mesons) or three quarks (baryons). The familiar proton and the neutron are the two baryons having
the smallest mass. Quarks also carry electric charge and weak isospin. Hence they interact with other fermions both
electromagnetically and via the weak nuclear interaction.
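The charge assignments in the table make it easy to verify the electric charges of composite particles. The sketch below assumes the standard quark content of the proton (uud) and neutron (udd), and adds the positively charged pion (an up quark with a down antiquark) as an illustrative meson; the helper name is hypothetical.

```python
from fractions import Fraction

# Electric charges of the quarks in units of the elementary charge
QUARK_CHARGE = {
    "u": Fraction(2, 3), "c": Fraction(2, 3), "t": Fraction(2, 3),
    "d": Fraction(-1, 3), "s": Fraction(-1, 3), "b": Fraction(-1, 3),
}

def hadron_charge(quarks, antiquarks=""):
    """Total charge of a hadron; each antiquark contributes minus its quark's charge."""
    total = sum((QUARK_CHARGE[q] for q in quarks), Fraction(0))
    total -= sum((QUARK_CHARGE[q] for q in antiquarks), Fraction(0))
    return total

proton = hadron_charge("uud")                  # baryon: three quarks
neutron = hadron_charge("udd")                 # baryon: three quarks
pi_plus = hadron_charge("u", antiquarks="d")   # meson: quark plus antiquark
print(proton, neutron, pi_plus)  # 1 0 1
```

Exact fractions are used so that the thirds add up to integers without rounding error.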
The remaining six fermions do not carry color charge and are called leptons. The three neutrinos do not carry electric
charge either, so their motion is directly influenced only by the weak nuclear force, which makes them notoriously
difficult to detect. However, by virtue of carrying an electric charge, the electron, muon, and tau all interact
electromagnetically.
Each member of a generation has greater mass than the corresponding particles of lower generations. The first
generation charged particles do not decay; hence all ordinary (baryonic) matter is made of such particles.
Specifically, all atoms consist of electrons orbiting atomic nuclei ultimately constituted of up and down quarks.
Second and third generations charged particles, on the other hand, decay with very short half lives, and are observed
only in very high-energy environments. Neutrinos of all generations also do not decay, and pervade the universe, but
rarely interact with baryonic matter.
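The statement about increasing mass across generations can be checked against the mass values quoted in the particle chart accompanying this article. The sketch below uses those figures, converted to MeV, for the charged fermions only; neutrinos are left out because the chart gives only upper bounds for them.

```python
# Masses in MeV as quoted in the particle chart of this article
MASSES_MEV = {
    "up-type quarks": [2.4, 1270.0, 171200.0],   # up, charm, top
    "down-type quarks": [4.8, 104.0, 4200.0],    # down, strange, bottom
    "charged leptons": [0.511, 105.7, 1777.0],   # electron, muon, tau
}

def increases_with_generation(masses):
    """True if each generation is heavier than the one before it."""
    return all(a < b for a, b in zip(masses, masses[1:]))

ordering = {family: increases_with_generation(m) for family, m in MASSES_MEV.items()}
print(ordering)
```

Each family comes out strictly ordered, with the third generation heavier than the first by three to five orders of magnitude.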
Gauge bosons
In the Standard Model, gauge bosons
are force carriers that mediate the
strong, weak, and electromagnetic
fundamental interactions.
Interactions in physics are the ways
that particles influence other particles.
At a macroscopic level,
electromagnetism allows particles to
interact with one another via electric
and magnetic fields, and gravitation
allows particles with mass to attract
one another in accordance with
Einstein's general relativity. The
standard model explains such forces as
resulting from matter particles
exchanging other particles, known as
force mediating particles. (Strictly speaking, this is only so if one interprets literally what is actually an approximation method known as perturbation theory, as opposed to the exact theory.) When a force mediating particle is exchanged, at a macroscopic level the effect is equivalent to a force influencing both of the particles involved, and the exchanged particle is therefore said to have mediated (i.e., been the agent of) that force. The Feynman diagram calculations, which are a graphical form of the perturbation theory approximation, invoke "force mediating particles" and, when applied to analyze high-energy scattering experiments, are in reasonable agreement with the data. In other situations perturbation theory (and with it the concept of a "force mediating particle") fails. These include low-energy QCD, bound states, and solitons.
The gauge bosons of the Standard Model also all have spin (as do matter particles), but in their case, the value of the
spin is 1, making them bosons. As a result, they do not follow the Pauli exclusion principle. The different types of
gauge bosons are described below.
• Photons mediate the electromagnetic force between electrically charged particles. The photon is massless and is
well-described by the theory of quantum electrodynamics.
• The $W^+$, $W^-$, and $Z$ gauge bosons mediate the weak interactions between particles of different flavors (all quarks and leptons). They are massive, with the $Z$ being more massive than the $W^\pm$. The weak interactions involving the $W^\pm$ act exclusively on left-handed particles and right-handed antiparticles. Furthermore, the $W^\pm$ carry an electric charge of +1 and -1 and couple to the electromagnetic interaction. The electrically neutral $Z$ boson interacts with both left-handed particles and antiparticles. These three gauge bosons, along with the photon, are grouped together as collectively mediating the electroweak interactions.
• The eight gluons mediate the strong interactions between color charged particles (the quarks). Gluons are massless. The eightfold multiplicity of gluons is labeled by a combination of color and anticolor charge (e.g., red-antigreen). [12] Because gluons have an effective color charge, they can interact among themselves. The gluons and their interactions are described by the theory of quantum chromodynamics.
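The number of gauge bosons of each type follows from the dimension of the corresponding gauge group. The sketch below assumes the standard counting, one generator for U(1) and $N^2 - 1$ for SU(N), applied to the gauge group U(1) x SU(2) x SU(3); the function name is illustrative.

```python
def su_generators(n):
    """Number of generators (hence gauge bosons) of SU(n)."""
    return n * n - 1

GAUGE_GROUPS = {
    "U(1)": 1,                  # weak hypercharge
    "SU(2)": su_generators(2),  # weak isospin
    "SU(3)": su_generators(3),  # color: the eight gluons
}
total = sum(GAUGE_GROUPS.values())
print(GAUGE_GROUPS, total)
```

The four electroweak generators correspond, after symmetry breaking, to the $W^+$, $W^-$, $Z$ and the photon, and the eight SU(3) generators to the gluons, giving twelve gauge bosons in total.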
The interactions between all the particles described by the Standard Model are summarized by the diagram at the top
of this section.
Higgs boson
The Higgs particle is a hypothetical massive scalar elementary particle theorized by Robert Brout, Francois Englert,
Peter Higgs, Gerald Guralnik, C. R. Hagen, and Tom Kibble in 1964 (see 1964 PRL symmetry breaking papers) and
is a key building block in the Standard Model. [13] [14] [15] [16] It has no intrinsic spin, and for that reason is classified
as a boson (like the gauge bosons, which have integer spin). Because an exceptionally large amount of energy and
beam luminosity are theoretically required to observe a Higgs boson in high energy colliders, it is the only
fundamental particle predicted by the Standard Model that has yet to be observed.
The Higgs boson plays a unique role in the Standard Model, by explaining why the other elementary particles, the
photon and gluon excepted, are massive. In particular, the Higgs boson would explain why the photon has no mass,
while the W and Z bosons are very heavy. Elementary particle masses, and the differences between
electromagnetism (mediated by the photon) and the weak force (mediated by the W and Z bosons), are critical to
many aspects of the structure of microscopic (and hence macroscopic) matter. In electroweak theory, the Higgs
boson generates the masses of the leptons (electron, muon, and tau) and quarks.
As yet, no experiment has directly detected the existence of the Higgs boson. It is hoped that the Large Hadron
Collider at CERN will confirm the existence of this particle. It is also possible that the Higgs boson may already have been produced but overlooked. [17]
Field content
The standard model has the following fields:
Spin 1
1. A U(1) gauge field $B_\mu$ with coupling $g'$ (weak U(1), or weak hypercharge)
2. An SU(2) gauge field $W_\mu$ with coupling $g$ (weak SU(2), or weak isospin)
3. An SU(3) gauge field $G_\mu$ with coupling $g_s$ (strong SU(3), or color charge)
Spin 1/2
The spin 1/2 particles are in representations of the gauge groups. For the U(1) group, we list the value of the weak hypercharge instead. The left-handed fermionic fields are:
1. An SU(3) triplet, SU(2) doublet, with U(1) weak hypercharge 1/3 (left-handed quarks)
2. An SU(3) antitriplet, SU(2) singlet, with U(1) weak hypercharge 2/3 (left-handed down-type antiquark)
3. An SU(3) singlet, SU(2) doublet with U(1) weak hypercharge -1 (left-handed lepton)
4. An SU(3) antitriplet, SU(2) singlet, with U(1) weak hypercharge -4/3 (left-handed up-type antiquark)
5. An SU(3) singlet, SU(2) singlet with U(1) weak hypercharge 2 (left-handed antilepton)
By CPT symmetry, there is a set of right-handed fermions with the opposite quantum numbers.
This describes one generation of leptons and quarks, and there are three generations, so there are three copies of each
field. Note that there are twice as many left-handed lepton field components as left-handed antilepton field
components in each generation, but an equal number of left-handed quark and antiquark fields.
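The component counting in the paragraph above can be checked mechanically. The following is a minimal sketch, assuming each field is recorded as (SU(3) dimension, SU(2) dimension, weak hypercharge) as read off the list above; the names `LEFT_HANDED_FIELDS` and `components` are illustrative and do not come from the source.

```python
from fractions import Fraction

# One generation of left-handed fermion fields, as listed in the text:
# name -> (SU(3) dimension, SU(2) dimension, weak hypercharge Y)
LEFT_HANDED_FIELDS = {
    "quark doublet":       (3, 2, Fraction(1, 3)),
    "down-type antiquark": (3, 1, Fraction(2, 3)),
    "lepton doublet":      (1, 2, Fraction(-1)),
    "up-type antiquark":   (3, 1, Fraction(-4, 3)),
    "antilepton":          (1, 1, Fraction(2)),
}

def components(name):
    """Number of field components = (color states) x (isospin states)."""
    su3_dim, su2_dim, _ = LEFT_HANDED_FIELDS[name]
    return su3_dim * su2_dim

lepton_components = components("lepton doublet")
antilepton_components = components("antilepton")
quark_components = components("quark doublet")
antiquark_components = (components("down-type antiquark")
                        + components("up-type antiquark"))

# Weak hypercharge summed over every component of the generation.
hypercharge_sum = sum(d3 * d2 * y for d3, d2, y in LEFT_HANDED_FIELDS.values())

print("lepton vs antilepton components:", lepton_components, antilepton_components)
print("quark vs antiquark components:", quark_components, antiquark_components)
print("sum of weak hypercharges:", hypercharge_sum)
```

With these assignments the lepton doublet carries two components against one for the antilepton, the quark and antiquark fields carry six each, and the hypercharges of a full generation add up to zero.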
Spin 0
1. An SU(2) doublet H with U(1) hypercharge -1 (Higgs field)
Note that |H|^2, summed over the two SU(2) components, is invariant under both SU(2) and under U(1), and so it can
appear as a renormalizable term in the Lagrangian, as can its square.
This field acquires a vacuum expectation value, leaving a combination of the weak isospin, and weak hypercharge
unbroken. This is the electromagnetic gauge group, and the photon remains massless. The standard formula for the
electric charge (which defines the normalization of the weak hypercharge, Y, which would otherwise be somewhat
arbitrary) is:[18]
$$ Q = T_3 + \frac{Y}{2}, $$
where T_3 is the third component of the weak isospin.
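As a worked check of the charge formula, the sketch below evaluates Q = T_3 + Y/2 for the components of the fields listed earlier. It assumes the usual isospin assignment (T_3 = +1/2 for the upper and -1/2 for the lower component of a doublet, 0 for a singlet); the function name `electric_charge` and the dictionary labels are illustrative only.

```python
from fractions import Fraction

def electric_charge(t3, y):
    """Electric charge from weak isospin T_3 and weak hypercharge Y."""
    return Fraction(t3) + Fraction(y) / 2

HALF = Fraction(1, 2)

CHARGES = {
    "up quark":        electric_charge(+HALF, Fraction(1, 3)),
    "down quark":      electric_charge(-HALF, Fraction(1, 3)),
    "neutrino":        electric_charge(+HALF, -1),
    "electron":        electric_charge(-HALF, -1),
    "up antiquark":    electric_charge(0, Fraction(-4, 3)),
    "down antiquark":  electric_charge(0, Fraction(2, 3)),
    "antilepton":      electric_charge(0, 2),
    # Higgs doublet with hypercharge -1, as in the spin-0 list
    "Higgs (upper)":   electric_charge(+HALF, -1),
    "Higgs (lower)":   electric_charge(-HALF, -1),
}

for name, q in CHARGES.items():
    print(f"{name:16s} Q = {q}")
```

With hypercharge -1 the upper Higgs component comes out electrically neutral, which is why a vacuum expectation value in that component leaves electromagnetism unbroken.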
Lagrangian
The Lagrangian for the spin-1 and spin-1/2 fields is the most general renormalizable gauge field Lagrangian with no
fine tunings:
• Spin 1:
where the traces are over the SU(2) and SU(3) indices hidden in W and G respectively. The two-index objects are the
field strengths derived from W and G the vector fields. There are also two extra hidden parameters: the theta angles
for SU(2) and SU(3).
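For orientation, the spin-1 Lagrangian described here presumably has the standard Yang-Mills form. The sketch below is an assumption rather than a quotation: it takes matrix-valued fields W_μ = W_μ^a T^a and G_μ = G_μ^a T^a with generators normalized so that tr(T^a T^b) = δ^{ab}/2, and covariant derivatives of the form ∂_μ - i g W_μ. With a different normalization the numerical prefactors change.

```latex
\mathcal{L}_{\text{gauge}}
  = -\tfrac{1}{4} B_{\mu\nu} B^{\mu\nu}
    -\tfrac{1}{2} \operatorname{tr}\!\left( W_{\mu\nu} W^{\mu\nu} \right)
    -\tfrac{1}{2} \operatorname{tr}\!\left( G_{\mu\nu} G^{\mu\nu} \right),
\qquad
\begin{aligned}
  B_{\mu\nu} &= \partial_\mu B_\nu - \partial_\nu B_\mu, \\
  W_{\mu\nu} &= \partial_\mu W_\nu - \partial_\nu W_\mu - i g \,[W_\mu, W_\nu], \\
  G_{\mu\nu} &= \partial_\mu G_\nu - \partial_\nu G_\mu - i g_s \,[G_\mu, G_\nu].
\end{aligned}
```

The commutator terms are what make the SU(2) and SU(3) fields self-interacting, whereas the U(1) field strength is linear in B_μ.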
The spin-1/2 particles can have no mass terms because there is no right/left helicity pair with the same SU(2) and
SU(3) representation and the same weak hypercharge. This means that if the gauge charges were conserved in the
vacuum, none of the spin-1/2 particles could ever swap helicity, and they would all be massless.
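For a neutral fermion, for example a hypothetical right-handed lepton N (or N^α in relativistic two-spinor notation),
with no SU(3), SU(2) representation and zero charge, it is possible to add the term:
$$ \int d^4x \, M \left( N^\alpha N^\beta \epsilon_{\alpha\beta} + \bar{N}^{\dot\alpha} \bar{N}^{\dot\beta} \epsilon_{\dot\alpha\dot\beta} \right). $$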
This term gives the neutral fermion a Majorana mass. Since the generic value for M will be of order 1, such a particle
would generically be unacceptably heavy. The interactions are completely determined by the theory - the leptons
introduce no extra parameters.
Higgs mechanism
The Lagrangian for the Higgs includes the most general renormalizable self interaction:
$$ S_{\text{Higgs}} = \int d^4x \left[ (D_\mu H)^* (D^\mu H) + \lambda \left( |H|^2 - v^2 \right)^2 \right]. $$
The parameter v^2 has dimensions of mass squared, and it gives the location where the classical Lagrangian is at a
minimum. In order for the Higgs mechanism to work, v^2 must be a positive number. v has units of mass, and it is the
only parameter in the standard model which is not dimensionless. It is also much smaller than the Planck scale; it is
approximately equal to the Higgs mass, and sets the scale for the mass of everything else. This is the only real
fine-tuning to a small nonzero value in the standard model, and it is called the Hierarchy problem.
It is traditional to choose the SU(2) gauge so that the Higgs doublet in the vacuum has expectation value (v,0).
Masses and CKM matrix
The rest of the interactions are the most general spin-0 spin-1/2 Yukawa interactions, and there are many of these.
These constitute most of the free parameters in the model. The Yukawa couplings generate the masses and mixings
once the Higgs gets its vacuum expectation value.
The terms L*HR generate a mass term for each of the three generations of leptons. There are 9 of these terms, but by
relabeling L and R, the matrix can be diagonalized. Since only the upper component of H is nonzero, the upper
SU(2) component of L mixes with R to make the electron, the muon, and the tau, leaving over a lower massless
component, the neutrino. (Neutrino oscillations show that neutrinos do have mass; see the OPERA press release of
31 May 2010, http://operaweb.lngs.infn.it/spip.php?rubrique14.)
The terms QHU generate up masses, while QHD generate down masses. But since there is more than one
right-handed singlet in each generation, it is not possible to diagonalize both with a good basis for the fields, and
there is an extra CKM matrix.
Theoretical aspects
Construction of the Standard Model Lagrangian
Parameters of the Standard Model
Symbol | Description | Renormalization scheme (point) | Value
m_e | Electron mass | | 511 keV
m_μ | Muon mass | | 105.7 MeV
m_τ | Tau mass | | 1.78 GeV
m_u | Up quark mass | μ(MS-bar) = 2 GeV | 1.9 MeV
m_d | Down quark mass | μ(MS-bar) = 2 GeV | 4.4 MeV
m_s | Strange quark mass | μ(MS-bar) = 2 GeV | 87 MeV
m_c | Charm quark mass | μ(MS-bar) = m_c | 1.32 GeV
m_b | Bottom quark mass | μ(MS-bar) = m_b | 4.24 GeV
m_t | Top quark mass | On-shell scheme | 172.7 GeV
θ_12 | CKM 12-mixing angle | | 13.1°
θ_23 | CKM 23-mixing angle | | 2.4°
θ_13 | CKM 13-mixing angle | | 0.2°
δ | CKM CP-violating phase | | 0.995
g_1 | U(1) gauge coupling | μ(MS-bar) = m_Z | 0.357
g_2 | SU(2) gauge coupling | μ(MS-bar) = m_Z | 0.652
g_3 | SU(3) gauge coupling | μ(MS-bar) = m_Z | 1.221
θ_QCD | QCD vacuum angle | | ~0
μ | Higgs quadratic coupling | | Unknown
λ | Higgs self-coupling strength | | Unknown
Technically, quantum field theory provides the mathematical framework for the standard model, in which a
Lagrangian controls the dynamics and kinematics of the theory. Each kind of particle is described in terms of a
dynamical field that pervades space-time. The construction of the standard model proceeds following the modern
method of constructing most field theories: by first postulating a set of symmetries of the system, and then by writing
down the most general renormalizable Lagrangian from its particle (field) content that observes these symmetries.
The global Poincaré symmetry is postulated for all relativistic quantum field theories. It consists of the familiar
translational symmetry, rotational symmetry and the inertial reference frame invariance central to the theory of
special relativity. The local SU(3)×SU(2)×U(1) gauge symmetry is an internal symmetry that essentially defines the
standard model. Roughly, the three factors of the gauge symmetry give rise to the three fundamental interactions.
The fields fall into different representations of the various symmetry groups of the Standard Model (see table). Upon
writing the most general Lagrangian, one finds that the dynamics depend on 19 parameters, whose numerical values
are established by experiment. The parameters are summarized in the table above.
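Four of the 19 parameters (three mixing angles and one phase) determine the CKM matrix. The sketch below builds it from the tabulated values and checks that it is unitary. It assumes the widely used standard parametrization and assumes that the phase 0.995 is given in radians; neither is stated in the table, and the function names are illustrative.

```python
import cmath
import math

def ckm_matrix(theta12_deg, theta23_deg, theta13_deg, delta):
    """CKM matrix in the standard parametrization (angles in degrees, delta in radians)."""
    t12, t23, t13 = (math.radians(a) for a in (theta12_deg, theta23_deg, theta13_deg))
    c12, s12 = math.cos(t12), math.sin(t12)
    c23, s23 = math.cos(t23), math.sin(t23)
    c13, s13 = math.cos(t13), math.sin(t13)
    phase = cmath.exp(1j * delta)
    return [
        [c12 * c13, s12 * c13, s13 / phase],
        [-s12 * c23 - c12 * s23 * s13 * phase,
         c12 * c23 - s12 * s23 * s13 * phase,
         s23 * c13],
        [s12 * s23 - c12 * c23 * s13 * phase,
         -c12 * s23 - s12 * c23 * s13 * phase,
         c23 * c13],
    ]

def unitarity_defect(v):
    """Largest deviation of V V^dagger from the identity matrix."""
    n = len(v)
    worst = 0.0
    for i in range(n):
        for j in range(n):
            entry = sum(v[i][k] * v[j][k].conjugate() for k in range(n))
            worst = max(worst, abs(entry - (1 if i == j else 0)))
    return worst

# Values from the parameter table: 13.1 deg, 2.4 deg, 0.2 deg, phase 0.995
V = ckm_matrix(13.1, 2.4, 0.2, 0.995)
print("|V_us| =", abs(V[0][1]))
print("unitarity defect =", unitarity_defect(V))
```

The magnitude of the up-strange element is fixed almost entirely by the 12-mixing angle, since the other two angles are small.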
The QCD sector
The QCD sector defines the interactions between quarks and gluons, with SU(3) symmetry, generated by T^a. Since
leptons do not interact with gluons, they are not affected by this sector.
$$ \mathcal{L}_{QCD} = i \bar{U} \left( \partial_\mu - i g_s G_\mu^a T^a \right) \gamma^\mu U + i \bar{D} \left( \partial_\mu - i g_s G_\mu^a T^a \right) \gamma^\mu D. $$
Here G_μ^a is the SU(3) gauge field of the gluons, γ^μ are the Dirac matrices, D and U are the Dirac spinors of the
down-type and up-type quarks, and g_s is the strong coupling constant.
The electroweak sector
The electroweak sector is a Yang-Mills gauge theory with the symmetry group U(1)×SU(2)_L,
$$ \mathcal{L}_{EW} = \sum_\psi \bar\psi \gamma^\mu \left( i \partial_\mu - \tfrac{1}{2} g' Y_W B_\mu - \tfrac{1}{2} g \vec\tau_L \cdot \vec{W}_\mu \right) \psi, $$
where B_μ is the U(1) gauge field; Y_W is the weak hypercharge, the generator of the U(1) group; W_μ is the
three-component SU(2) gauge field; τ_L are the Pauli matrices, the infinitesimal generators of the SU(2) group. The
subscript L indicates that they only act on left fermions; g' and g are coupling constants.
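The statement that the Pauli matrices generate SU(2) can be verified directly. The following is a small sketch in plain Python, assuming the conventional normalization T_a = τ_a/2 and checking the commutation relations [T_a, T_b] = i ε_abc T_c; all helper names are illustrative.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

def levi_civita(a, b, c):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2

PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

# SU(2) generators in the conventional normalization T_a = tau_a / 2
GENERATORS = [[[x / 2 for x in row] for row in tau] for tau in PAULI]

def su2_algebra_defect():
    """Largest violation of [T_a, T_b] = i * eps_abc * T_c over all a, b."""
    worst = 0.0
    for a in range(3):
        for b in range(3):
            lhs = commutator(GENERATORS[a], GENERATORS[b])
            for i in range(2):
                for j in range(2):
                    rhs = sum(1j * levi_civita(a, b, c) * GENERATORS[c][i][j]
                              for c in range(3))
                    worst = max(worst, abs(lhs[i][j] - rhs))
    return worst

print("largest violation of the su(2) algebra:", su2_algebra_defect())
```

The non-vanishing commutators are the algebraic reason the SU(2) gauge fields interact with one another, unlike the U(1) field.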
The Higgs sector
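In the Standard Model, the Higgs field is a complex spinor of the group SU(2)_L:
$$ \varphi = \frac{1}{\sqrt{2}} \begin{pmatrix} \varphi^+ \\ \varphi^0 \end{pmatrix}, $$
where the indexes + and 0 indicate the electric charge (Q) of the components. The weak hypercharge (Y_W) of both
components is 1.
Before symmetry breaking, the Higgs Lagrangian is:
$$ \mathcal{L}_H = \varphi^\dagger \left( \partial^\mu - \tfrac{i}{2} \left( g' Y_W B^\mu + g \vec\tau \vec{W}^\mu \right) \right) \left( \partial_\mu + \tfrac{i}{2} \left( g' Y_W B_\mu + g \vec\tau \vec{W}_\mu \right) \right) \varphi - \frac{\lambda^2}{4} \left( \varphi^\dagger \varphi - v^2 \right)^2, $$
where the first derivative is understood to act to the left on φ†. This can also be written as:
$$ \mathcal{L}_H = \left| \left( \partial_\mu + \tfrac{i}{2} \left( g' Y_W B_\mu + g \vec\tau \vec{W}_\mu \right) \right) \varphi \right|^2 - \frac{\lambda^2}{4} \left( \varphi^\dagger \varphi - v^2 \right)^2. $$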
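To see how the quartic term fixes the mass of the physical Higgs particle, one can expand the potential V = (λ²/4)(φ†φ - v²)² around its minimum. The sketch below assumes the unitary-gauge parametrization φ = (0, v + h/√2)^T with a real field h; this choice is not spelled out in the text, and it is made so that the kinetic term of h is canonically normalized.

```latex
\varphi^\dagger \varphi
  = \left( v + \tfrac{h}{\sqrt{2}} \right)^{2}
  = v^{2} + \sqrt{2}\, v\, h + \tfrac{1}{2} h^{2},
\qquad
V = \frac{\lambda^{2}}{4} \left( \sqrt{2}\, v\, h + \tfrac{1}{2} h^{2} \right)^{2}
  = \frac{\lambda^{2} v^{2}}{2}\, h^{2}
    + \frac{\lambda^{2} v}{2\sqrt{2}}\, h^{3}
    + \frac{\lambda^{2}}{16}\, h^{4}.
```

Comparing the quadratic term with (1/2) m_h² h² gives m_h = λ v under these assumptions, so for a coupling λ of order one the Higgs mass is of the same order as v.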
Additional symmetries of the Standard Model
From the theoretical point of view, the Standard Model exhibits four additional global symmetries, not postulated at
the outset of its construction, collectively denoted accidental symmetries, which are continuous U(1) global
symmetries. The transformations leaving the Lagrangian invariant are:
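$$ \psi_q(x) \to e^{i\alpha/3} \psi_q(x) $$
$$ E_L \to e^{i\beta} E_L \quad \text{and} \quad (e_R)^c \to e^{-i\beta} (e_R)^c $$
$$ M_L \to e^{i\beta} M_L \quad \text{and} \quad (\mu_R)^c \to e^{-i\beta} (\mu_R)^c $$
$$ T_L \to e^{i\beta} T_L \quad \text{and} \quad (\tau_R)^c \to e^{-i\beta} (\tau_R)^c. $$
The first transformation rule is shorthand meaning that all quark fields for all generations must be rotated by an
identical phase simultaneously. The three lepton rules are separate symmetries, each with its own independent phase;
the conjugate fields rotate with the opposite phase because they carry the opposite lepton number. The fields M_L,
T_L and (μ_R)^c, (τ_R)^c are the 2nd (muon) and 3rd (tau) generation analogs of the E_L and (e_R)^c fields.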
By Noether's theorem, each symmetry above has an associated conservation law: the conservation of baryon number,
electron number, muon number, and tau number. Each quark is assigned a baryon number of 1/3, while each
antiquark is assigned a baryon number of -1/3. Conservation of baryon number implies that the number of quarks
minus the number of antiquarks is a constant. Within experimental limits, no violation of this conservation law has
been found.
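The baryon-number assignment above amounts to simple bookkeeping. Here is a minimal sketch; the convention of marking an antiquark with a trailing "~" is purely illustrative, as is the function name.

```python
from fractions import Fraction

def baryon_number(constituents):
    """Total baryon number of a list of quarks; a trailing '~' marks an antiquark."""
    total = Fraction(0)
    for quark in constituents:
        total += Fraction(-1, 3) if quark.endswith("~") else Fraction(1, 3)
    return total

proton = baryon_number(["u", "u", "d"])         # three quarks
pion_plus = baryon_number(["u", "d~"])          # quark-antiquark pair
antiproton = baryon_number(["u~", "u~", "d~"])  # three antiquarks

print(proton, pion_plus, antiproton)
```

Baryons built from three quarks carry baryon number 1 and mesons carry 0, so a process can create or destroy mesons freely while the net number of baryons stays fixed.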
Similarly, each electron and its associated neutrino is assigned an electron number of +1, while the antielectron and
the associated antineutrino carry -1 electron number. Similarly, the muons and their neutrinos are assigned a muon
number of +1 and the tau leptons are assigned a tau lepton number of +1. The Standard Model predicts that each of
these three numbers should be conserved separately in a manner similar to the way baryon number is conserved.
These numbers are collectively known as lepton family numbers (LF). Symmetry works differently for quarks than
for leptons, mainly because the Standard Model predicts that neutrinos are massless. However, it was recently found
that neutrinos have small masses and oscillate between flavors, signaling that the conservation of lepton family
number is violated.
In addition to the accidental (but exact) symmetries described above, the Standard Model exhibits several
approximate symmetries. These are the "SU(2) custodial symmetry" and the "SU(2) or SU(3) quark flavor
symmetry."
Symmetries of the Standard Model and Associated Conservation Laws
Symmetry | Lie Group | Symmetry Type | Conservation Law
Poincaré | Translations × SO(3,1) | Global symmetry | Energy, Momentum, Angular momentum
Gauge | SU(3)×SU(2)×U(1) | Local symmetry | Color charge, Weak isospin, Electric charge, Weak hypercharge
Baryon phase | U(1) | Accidental Global symmetry | Baryon number
Electron phase | U(1) | Accidental Global symmetry | Electron number
Muon phase | U(1) | Accidental Global symmetry | Muon number
Tau phase | U(1) | Accidental Global symmetry | Tau number
Field content of the Standard Model
Field (1st generation) | Symbol | Spin | Gauge group representation | Baryon number | Electron number
Left-handed quark | Q_L | 1/2 | (3, 2, +1/3) | 1/3 | 0
Left-handed up antiquark | ū_L = (u_R)^c | 1/2 | (3̄, 1, -4/3) | -1/3 | 0
Left-handed down antiquark | d̄_L = (d_R)^c | 1/2 | (3̄, 1, +2/3) | -1/3 | 0
Left-handed lepton | E_L | 1/2 | (1, 2, -1) | 0 | 1
Left-handed antielectron | ē_L = (e_R)^c | 1/2 | (1, 1, +2) | 0 | -1
Hypercharge gauge field | B_μ | 1 | (1, 1, 0) | 0 | 0
Isospin gauge field | W_μ | 1 | (1, 3, 0) | 0 | 0
Gluon field | G_μ | 1 | (8, 1, 0) | 0 | 0
Higgs field | H | 0 | (1, 2, +1) | 0 | 0
List of standard model fermions
This table is based in part on data gathered by the Particle Data Group.[19]
Left-handed fermions in the Standard Model
Generation 1
Fermion (left-handed) | Symbol | Electric charge | Weak isospin | Weak hypercharge | Color charge * | Mass **
Electron | e⁻ | -1 | -1/2 | -1 | 1 | 511 keV
Positron | e⁺ | +1 | 0 | +2 | 1 | 511 keV
Electron neutrino | ν_e | 0 | +1/2 | -1 | 1 | < 2 eV ****
Electron antineutrino | ν̄_e | 0 | 0 | 0 | 1 | < 2 eV ****
Up quark | u | +2/3 | +1/2 | +1/3 | 3 | ~ 3 MeV ***
Up antiquark | ū | -2/3 | 0 | -4/3 | 3̄ | ~ 3 MeV ***
Down quark | d | -1/3 | -1/2 | +1/3 | 3 | ~ 6 MeV ***
Down antiquark | d̄ | +1/3 | 0 | +2/3 | 3̄ | ~ 6 MeV ***
Generation 2
Fermion (left-handed) | Symbol | Electric charge | Weak isospin | Weak hypercharge | Color charge * | Mass **
Muon | μ⁻ | -1 | -1/2 | -1 | 1 | 106 MeV
Antimuon | μ⁺ | +1 | 0 | +2 | 1 | 106 MeV
Muon neutrino | ν_μ | 0 | +1/2 | -1 | 1 | < 2 eV ****
Muon antineutrino | ν̄_μ | 0 | 0 | 0 | 1 | < 2 eV ****
Charm quark | c | +2/3 | +1/2 | +1/3 | 3 | ~ 1.3 GeV
Charm antiquark | c̄ | -2/3 | 0 | -4/3 | 3̄ | ~ 1.3 GeV
Strange quark | s | -1/3 | -1/2 | +1/3 | 3 | ~ 100 MeV
Strange antiquark | s̄ | +1/3 | 0 | +2/3 | 3̄ | ~ 100 MeV
Generation 3
Fermion (left-handed) | Symbol | Electric charge | Weak isospin | Weak hypercharge | Color charge * | Mass **
Tau | τ⁻ | -1 | -1/2 | -1 | 1 | 1.78 GeV
Antitau | τ⁺ | +1 | 0 | +2 | 1 | 1.78 GeV
Tau neutrino | ν_τ | 0 | +1/2 | -1 | 1 | < 2 eV ****
Tau antineutrino | ν̄_τ | 0 | 0 | 0 | 1 | < 2 eV ****
Top quark | t | +2/3 | +1/2 | +1/3 | 3 | 171 GeV
Top antiquark | t̄ | -2/3 | 0 | -4/3 | 3̄ | 171 GeV
Bottom quark | b | -1/3 | -1/2 | +1/3 | 3 | ~ 4.2 GeV
Bottom antiquark | b̄ | +1/3 | 0 | +2/3 | 3̄ | ~ 4.2 GeV
Notes:
• * These are not ordinary abelian charges, which can be added together, but are labels of group representations of Lie groups.
• ** Mass is really a coupling between a left-handed fermion and a right-handed fermion. For example, the mass of an electron is really a
coupling between a left-handed electron and a right-handed electron, which is the antiparticle of a left-handed positron. Also neutrinos show
large mixings in their mass coupling, so it's not accurate to talk about neutrino masses in the flavor basis or to suggest a left-handed electron
antineutrino.
• *** The masses of baryons and hadrons and various cross-sections are the experimentally measured quantities. Since quarks can't be isolated
because of QCD confinement, the quantity here is supposed to be the mass of the quark at the renormalization scale of the QCD scale.
• **** The Standard Model assumes that neutrinos are massless. However, several contemporary experiments prove that neutrinos oscillate
between their flavour states, which could not happen if all were massless. It is straightforward to extend the model to fit these data but there
are many possibilities, so the mass eigenstates are still open. See Neutrino#Mass.
Tests and predictions
The Standard Model (SM) predicted the existence of the W and Z bosons, the gluon, and the top and charm quarks
before these particles were observed. Their predicted properties were experimentally confirmed with good
precision. To give an idea of the success of the SM, the following table compares the measured masses of the W and
Z bosons with the masses predicted by the SM:
[Figure: Log plot of masses in the Standard Model.]
Quantity | Measured (GeV) | SM prediction (GeV)
Mass of W boson | 80.398 ± 0.025 | 80.390 ± 0.018
Mass of Z boson | 91.1876 ± 0.0021 | 91.1874 ± 0.0021
The SM also makes several predictions about the decay of Z bosons, which have been experimentally confirmed by
the Large Electron-Positron Collider at CERN.
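A rough tree-level estimate already lands close to the measured masses. The sketch below uses the SU(2) and U(1) couplings from the parameter table together with a Higgs vacuum expectation value of about 246 GeV. That number is not given in this article and is an assumption here, quoted in the convention where m_W = g v / 2 (it differs by a factor √2 from the normalization |H| = v used in the Higgs mechanism section). The predictions in the table above include radiative corrections, which this sketch ignores.

```python
import math

G_SU2 = 0.652      # SU(2) gauge coupling, from the parameter table
G_U1 = 0.357       # U(1) gauge coupling, from the parameter table
VEV_GEV = 246.22   # assumed Higgs vacuum expectation value (m_W = g v / 2 convention)

def tree_level_masses(g, g_prime, vev):
    """Tree-level W and Z masses in GeV."""
    m_w = 0.5 * g * vev
    m_z = 0.5 * math.sqrt(g**2 + g_prime**2) * vev
    return m_w, m_z

M_W, M_Z = tree_level_masses(G_SU2, G_U1, VEV_GEV)
SIN2_THETA_W = G_U1**2 / (G_SU2**2 + G_U1**2)  # weak mixing angle

print(f"m_W ~ {M_W:.1f} GeV, m_Z ~ {M_Z:.1f} GeV, sin^2(theta_W) ~ {SIN2_THETA_W:.3f}")
```

The ratio m_W/m_Z equals the cosine of the weak mixing angle at this level of approximation, which is why the Z comes out heavier than the W.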
Challenges to the standard model
There is some experimental evidence consistent with neutrinos having mass, which the Standard Model does not
allow. To accommodate such findings, the Standard Model can be modified by adding a non-renormalizable
interaction of lepton fields with the square of the Higgs field. This is natural in certain grand unified theories, and if
new physics appears at about 10^16 GeV, the neutrino masses are of the right order of magnitude.
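The order-of-magnitude claim can be made concrete. An interaction of two lepton fields with the square of the Higgs field, suppressed by a new-physics scale Λ, gives a neutrino mass of roughly v²/Λ. The sketch below assumes a dimensionless coefficient of order one and a Higgs expectation value of about 174 GeV; both numbers are assumptions for illustration and are not taken from the text.

```python
def neutrino_mass_ev(vev_gev, new_physics_scale_gev):
    """Estimate m_nu ~ v^2 / Lambda, converted from GeV to eV."""
    mass_gev = vev_gev**2 / new_physics_scale_gev
    return mass_gev * 1e9

# Assumed inputs: v ~ 174 GeV, Lambda ~ 10^16 GeV
M_NU_EV = neutrino_mass_ev(174.0, 1e16)
print(f"m_nu ~ {M_NU_EV:.1e} eV")
```

The result is a few thousandths of an electronvolt, well below the 2 eV upper bounds quoted in the fermion tables.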
Currently, there is one elementary particle predicted by the Standard Model that has yet to be observed: the Higgs
boson. A major reason for building the Large Hadron Collider is that the high energies of which it is capable are
expected to make the Higgs observable. However, as of August 2008, there is only indirect empirical evidence for
the existence of the Higgs boson, so that its discovery cannot be claimed. Moreover, there are serious theoretical
reasons for supposing that elementary scalar Higgs particles cannot exist (see Quantum triviality).
A fair amount of theoretical and experimental research has attempted to extend the Standard Model into a Unified
Field Theory or a Theory of everything, a complete theory explaining all physical phenomena including constants.
Inadequacies of the Standard Model that motivate such research include:
• It does not attempt to explain gravitation, and unlike for the strong and electroweak interactions of the Standard
Model, there is no known way of describing general relativity, the canonical theory of gravitation, consistently in
terms of quantum field theory. The reason for this is among other things that quantum field theories of gravity
generally break down before reaching the Planck scale. As a consequence, we have no reliable theory for the very
early universe;
• It seems rather ad-hoc and inelegant, requiring 19 numerical constants whose values are unrelated and arbitrary.
Although the Standard Model can be extended to accommodate neutrino masses, the specifics of neutrino
mass are still unclear. It is believed that explaining neutrino mass will require an additional 7 or 8 constants,
which are also arbitrary parameters;
• The Higgs mechanism gives rise to the hierarchy problem if any new physics (such as quantum gravity) is present
at high energy scales. In order for the weak scale to be much smaller than the Planck scale, severe fine tuning of
Standard Model parameters is required;
• It should be modified so as to be consistent with the emerging "standard model of cosmology." In particular, the
Standard Model cannot explain the observed amount of cold dark matter (CDM) and gives contributions to dark
energy which are far too large. It is also difficult to accommodate the observed predominance of matter over
antimatter (matter/antimatter asymmetry). The isotropy and homogeneity of the visible universe over large
distances seem to require a mechanism like cosmic inflation, which would also constitute an extension of the
Standard Model.
Currently no proposed Theory of everything has been conclusively verified.
Notes and references
Notes
[1] S.L. Glashow (1961). "Partial-symmetries of weak interactions". Nuclear Physics 22: 579-588. doi:10.1016/0029-5582(61)90469-2.
[2] S. Weinberg (1967). "A Model of Leptons". Physical Review Letters 19: 1264-1266. doi:10.1103/PhysRevLett.19.1264.
[3] A. Salam (1968). N. Svartholm. ed. Elementary Particle Physics: Relativistic Groups and Analyticity. Eighth Nobel Symposium. Stockholm:
Almqvist and Wiksell. pp. 367.
[4] F. Englert, R. Brout (1964). "Broken Symmetry and the Mass of Gauge Vector Mesons". Physical Review Letters 13: 321-323.
doi:10.1103/PhysRevLett.13.321.
[5] P.W. Higgs (1964). "Broken Symmetries and the Masses of Gauge Bosons". Physical Review Letters 13: 508-509.
doi:10.1103/PhysRevLett.13.508.
[6] G.S. Guralnik, C.R. Hagen, T.W.B. Kibble (1964). "Global Conservation Laws and Massless Particles". Physical Review Letters 13:
585-587. doi:10.1103/PhysRevLett.13.585.
[7] F.J. Hasert et al. (1973). "Search for elastic muon-neutrino electron scattering". Physics Letters B 46: 121.
doi:10.1016/0370-2693(73)90494-2.
[8] F.J. Hasert et al. (1973). "Observation of neutrino-like interactions without muon or electron in the Gargamelle neutrino experiment". Physics
Letters B 46: 138. doi:10.1016/0370-2693(73)90499-1.
[9] F.J. Hasert et al. (1974). "Observation of neutrino-like interactions without muon or electron in the Gargamelle neutrino experiment". Nuclear
Physics B 73: 1. doi:10.1016/0550-3213(74)90038-8.
[10] D. Haidt (4 October 2004). "The discovery of the weak neutral currents" (http://cerncourier.com/cws/article/cern/29168). CERN
Courier. Retrieved 2008-05-08.
[11] "Details can be worked out if the situation is simple enough for us to make an approximation, which is almost never, but often we can
understand more or less what is happening." from The Feynman Lectures on Physics, Vol 1. pp. 2-7
[12] Technically, there are nine such color-anticolor combinations. However there is one color symmetric combination that can be constructed
out of a linear superposition of the nine combinations, reducing the count to eight.
[13] F. Englert, R. Brout (1964). "Broken Symmetry and the Mass of Gauge Vector Mesons". Physical Review Letters 13: 321-323.
doi:10.1103/PhysRevLett.13.321.
[14] P.W. Higgs (1964). "Broken Symmetries and the Masses of Gauge Bosons". Physical Review Letters 13: 508-509.
doi:10.1103/PhysRevLett.13.508.
[15] G.S. Guralnik, C.R. Hagen, T.W.B. Kibble (1964). "Global Conservation Laws and Massless Particles". Physical Review Letters 13:
585-587. doi:10.1103/PhysRevLett.13.585.
[16] G.S. Guralnik (2009). "The History of the Guralnik, Hagen and Kibble development of the Theory of Spontaneous Symmetry Breaking and
Gauge Particles". International Journal of Modern Physics A 24: 2601-2627. doi:10.1142/S0217751X09045431. arXiv:0907.3466.
[17] A. Cho (23 January 2008). "Higgs Hiding in Plain Sight?" (http://sciencenow.sciencemag.org/cgi/content/full/2008/123/3).
ScienceNOW. Retrieved 2008-05-08.
[18] The normalization Q = T_3 + Y is sometimes used instead.
[19] W.-M. Yao et al. (Particle Data Group) (2006). "Review of Particle Physics: Quarks" (http://pdg.lbl.gov/2006/tables/qxxx.pdf). Journal
of Physics G 33: 1. doi:10.1088/0954-3899/33/1/001.
[20] W.-M. Yao et al. (Particle Data Group) (2006). "Review of Particle Physics: Neutrino mass, mixing, and flavor change" (http://pdg.lbl.gov/2007/reviews/numixrpp.pdf).
Journal of Physics G 33: 1.
[21] http://press.web.cern.ch/press/PressReleases/Releases2010/PR08.10E.html
References
Further reading
• R. Oerter (2006). The Theory of Almost Everything: The Standard Model, the Unsung Triumph of Modern
Physics. Plume.
• B.A. Schumm (2004). Deep Down Things: The Breathtaking Beauty of Particle Physics. Johns Hopkins
University Press. ISBN 0-8018-7971-X.
• V. Stenger (2000). Timeless Reality. Prometheus Books. See chapters 9-12 in particular.
Introductory textbooks
• I. Aitchison, A. Hey (2003). Gauge Theories in Particle Physics: A Practical Introduction. Institute of Physics.
ISBN 9780585445502.
• W. Greiner, B. Müller (2000). Gauge Theory of Weak Interactions. Springer. ISBN 3-540-67672-4.
• G.D. Coughlan, J.E. Dodd, B.M. Gripaios (2006). The Ideas of Particle Physics: An Introduction for Scientists.
Cambridge University Press.
• D.J. Griffiths (1987). Introduction to Elementary Particles. John Wiley & Sons. ISBN 0-471-60386-4.
• G.L. Kane (1987). Modern Elementary Particle Physics. Perseus Books. ISBN 0-201-11749-5.
Advanced textbooks
• T.P. Cheng, L.F. Li (2006). Gauge theory of elementary particle physics. Oxford University Press.
ISBN 0-19-851961-3. Highlights the gauge theory aspects of the Standard Model.
• J.F. Donoghue, E. Golowich, B.R. Holstein (1994). Dynamics of the Standard Model. Cambridge University
Press. ISBN 978-0521476522. Highlights dynamical and phenomenological aspects of the Standard Model.
• L. O'Raifeartaigh (1988). Group structure of gauge theories. Cambridge University Press. ISBN 0-521-34785-8.
Highlights group-theoretical aspects of the Standard Model.
Journal articles
• E.S. Abers, B.W. Lee (1973). "Gauge theories". Physics Reports 9: 1-141. doi:10.1016/0370-1573(73)90027-6.
• Y. Hayato et al. (1999). "Search for Proton Decay through p → ν̄K⁺ in a Large Water Cherenkov Detector".
Physical Review Letters 83: 1529. doi:10.1103/PhysRevLett.83.1529.
• S.F. Novaes (2000). "Standard Model: An Introduction". arXiv:hep-ph/0001283 [hep-ph].
• D.P. Roy (1999). "Basic Constituents of Matter and their Interactions - A Progress Report".
arXiv:hep-ph/9912523 [hep-ph].
• F. Wilczek (2004). "The Universe Is A Strange Place". arXiv:astro-ph/0401347 [astro-ph].
External links
• "Standard Model - explanation for beginners (http://cms.web.cern.ch/cms/Physics/StandardPackage/index.html)" LHC
• "Standard Model may be found incomplete (http://www.newscientist.com/news/news.jsp?id=ns9999404)"
New Scientist.
• "Observation of the Top Quark (http://www-cdf.fnal.gov/top_status/top.html)" at Fermilab.
• "The Standard Model Lagrangian (http://cosmicvariance.com/2006/11/23/thanksgiving)" After electroweak
symmetry breaking, with no explicit Higgs boson.
• "Standard Model Lagrangian (http://nuclear.ucdavis.edu/~tgutierr/files/stmL1.html)" with explicit Higgs
terms. PDF, PostScript, and LaTeX versions.
• "The particle adventure (http://particleadventure.org/)" Web tutorial.
• Nobes, Matthew (2002) "Introduction to the Standard Model of Particle Physics" on Kuro5hin:
Part 1 (http://www.kuro5hin.org/story/2002/5/1/3712/31700),
Part 2 (http://www.kuro5hin.org/story/2002/5/14/19363/8142),
Part 3a (http://www.kuro5hin.org/story/2002/7/15/173318/784),
Part 3b (http://www.kuro5hin.org/story/2002/8/21/195035/576).
Topological quantum field theory
A topological quantum field theory (or topological field theory or TQFT) is a quantum field theory which
computes topological invariants.
Although TQFTs were invented by physicists, they are also of mathematical interest, being related to, among other
things, knot theory and the theory of four-manifolds in algebraic topology, and to the theory of moduli spaces in
algebraic geometry. Donaldson, Jones, Witten, and Kontsevich have all won Fields Medals for work related to
topological field theory.
In condensed matter physics, topological quantum field theories are the low energy effective theories of
topologically ordered states, such as fractional quantum Hall states, string-net condensed states, and other strongly
correlated quantum liquid states.
In a topological field theory, the correlation functions do not depend on the metric on spacetime. This means that the
theory is not sensitive to changes in the shape of spacetime; if the spacetime warps or contracts, the correlation
functions do not change. Consequently, they are topological invariants.
Topological field theories are not very interesting on the flat Minkowski spacetime used in particle physics.
Minkowski space can be contracted to a point, so a TQFT on Minkowski space computes only trivial topological
invariants. Consequently, TQFTs are usually studied on curved spacetimes, such as, for example, Riemann surfaces.
Most of the known topological field theories are defined on spacetimes of dimension less than five. It seems that a
few higher dimensional theories exist, but they are not very well understood.
Quantum gravity is believed to be background-independent (in some suitable sense), and TQFTs provide examples
of background independent quantum field theories. This has prompted ongoing theoretical investigation of this class
of models.
(Caveat: It is often said that TQFTs have only finitely many degrees of freedom. This is not a fundamental property.
It happens to be true in most of the examples that physicists and mathematicians study, but it is not necessary. A
topological sigma model with target infinite-dimensional projective space, if such a thing could be defined, would
have countably infinitely many degrees of freedom.)
Specific models
The known topological field theories fall into two general classes: Schwarz-type TQFTs and Witten-type TQFTs.
Witten TQFTs are also sometimes referred to as cohomological field theories.
Schwarz-type TQFTs
In Schwarz-type TQFTs, the correlation functions computed by the path integral are topological invariants because
the path integral measure and the quantum field observables are explicitly independent of the metric. For instance, in
the BF model, the spacetime is a two-dimensional manifold M, and the observables are constructed from a two-form F,
an auxiliary scalar B, and their derivatives. The action (which determines the path integral) is
$$ S = \int_M B F. $$
The spacetime metric does not appear anywhere in this theory, so the theory is explicitly topologically invariant.
Another, more famous example is Chern-Simons theory, which can be used to compute knot invariants.
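The metric independence of the BF action can be made explicit in local coordinates. The following is a sketch, assuming F = (1/2) F_{μν} dx^μ ∧ dx^ν and writing ε^{μν} for the Levi-Civita symbol, a tensor density that is defined without reference to any metric.

```latex
S = \int_M B\, F
  = \frac{1}{2} \int_M d^{2}x \; \epsilon^{\mu\nu}\, B\, F_{\mu\nu},
\qquad
T_{\mu\nu} \;\propto\; \frac{\delta S}{\delta g^{\mu\nu}} = 0 .
```

Since no factor of the metric or of its determinant appears, varying the action with respect to the metric gives a vanishing stress-energy tensor, which is the local expression of topological invariance for Schwarz-type theories.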
Witten-type TQFTs
In Witten-type topological field theories, the topological invariance is more subtle. For example, the Lagrangian for
the WZW model does depend explicitly on the metric, but one shows by calculation that the expectation value of the
partition function and a special class of correlation functions are in fact diffeomorphism invariant.
Mathematical formulations
Atiyah-Segal axioms
Atiyah suggested a set of axioms for topological quantum field theory which was inspired by Segal's proposed
axioms for conformal field theory (Atiyah 1988). These axioms have been relatively useful for mathematical
treatments of Schwarz-type QFTs, although it isn't clear that they capture the whole structure of Witten-type QFTs.
The basic idea is that a TQFT is a functor from a certain category of cobordisms to the category of vector spaces.
There are in fact two different sets of axioms which could reasonably be called the Atiyah axioms. These axioms
differ basically in whether they study a TQFT defined on a single fixed n-dimensional Riemannian/Lorentzian
spacetime M or a TQFT defined on all n-dimensional spacetimes at once.
[ed. What follows is still in rough draft form and should be regarded suspiciously.]
The case of a fixed spacetime
Let Bord_M be the category whose morphisms are n-dimensional submanifolds of M and whose objects are
connected components of the boundaries of such submanifolds. Regard two morphisms as equivalent if they are
homotopic via submanifolds of M, and so form the quotient category hBord_M: the objects in hBord_M are the
objects of Bord_M, and the morphisms of hBord_M are homotopy equivalence classes of morphisms in Bord_M.
A TQFT on M is a symmetric monoidal functor from hBord_M to the category of vector spaces.
Note that cobordisms can, if their boundaries match up, be sewn together to form a new bordism. This is the
composition law for morphisms in the cobordism category. Since functors are required to preserve composition, this
says that the linear map corresponding to a sewn together morphism is just the composition of the linear map for
each piece.
There is an equivalence of categories between the category of 2-dimensional topological quantum field theories and
the category of commutative Frobenius algebras.
All n-dimensional spacetimes at once
To consider all spacetimes at once, it is necessary to replace hBord_M by a larger category. So let Bord_n be the
category of bordisms, i.e. the category whose morphisms are n-dimensional manifolds with boundary, and whose
objects are the connected components of the boundaries of n-dimensional manifolds. (Note that any
(n-1)-dimensional manifold may appear as an object in Bord_n.) As above, regard two morphisms in Bord_n as
equivalent if they are homotopic, and form the quotient category hBord_n.
Bord_n is a monoidal category under the operation which takes two bordisms to the bordism made from their
disjoint union. A TQFT on n-dimensional manifolds is then a functor from hBord_n to the category of vector
spaces, which takes disjoint unions of bordisms to the tensor product. [ed. unfinished]
[Figure: The pair of pants is a (1+1)-dimensional bordism, which corresponds to a product or coproduct in a
2-dimensional TQFT.]
For example, for (1+1)-dimensional bordisms (2-dimensional bordisms between 1-dimensional manifolds), the map
associated with a pair of pants gives a product or coproduct, depending on how the boundary components are
grouped (the product is commutative and the coproduct cocommutative), while the map associated with a disk
gives a counit (trace) or unit (scalars), depending on the grouping of the boundary. Thus (1+1)-dimensional
TQFTs correspond to Frobenius algebras.
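The correspondence can be illustrated by computing the number a 2-dimensional TQFT assigns to a closed surface. The sketch below assumes the simplest kind of commutative Frobenius algebra: n copies of the scalars with componentwise multiplication, a basis of idempotents e_i, and a counit with ε(e_i) = θ_i for nonzero θ_i. A closed surface of genus g is cut into a disk (the unit), g handles (each multiplying by the handle element Σ e_i/θ_i), and a final disk (the counit); composing the corresponding linear maps gives the invariant. The function name and this choice of algebra are illustrative, not taken from the article.

```python
from fractions import Fraction

def closed_surface_invariant(theta, genus):
    """Invariant of a closed genus-g surface for the Frobenius algebra with counit values theta."""
    theta = [Fraction(t) for t in theta]
    # Disk read as a map scalars -> A: the unit 1 = e_1 + ... + e_n.
    state = [Fraction(1)] * len(theta)
    # Each handle (coproduct followed by product) multiplies e_i by 1 / theta_i.
    for _ in range(genus):
        state = [s / t for s, t in zip(state, theta)]
    # Disk read as a map A -> scalars: the counit.
    return sum(s * t for s, t in zip(state, theta))

SPHERE = closed_surface_invariant([1, 2], 0)
TORUS = closed_surface_invariant([1, 2], 1)
GENUS_TWO = closed_surface_invariant([1, 2], 2)

print(SPHERE, TORUS, GENUS_TWO)
```

The torus always evaluates to the dimension of the algebra, independent of the θ_i, in line with the general fact that a TQFT assigns to the torus the dimension of the vector space attached to the circle.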
Generalizations
For some applications, it is convenient to demand extra topological structure on the morphisms, such as a choice of
orientation.
References
• Atiyah, Michael (1988), "Topological quantum field theories", Publications Mathématiques de l'IHÉS 68 (68):
175-186, doi:10.1007/BF02698547, MR1001453
• Lurie, Jacob, On the Classification of Topological Field Theories
• Witten, Edward (1988), "Topological quantum field theory", Communications in Mathematical Physics 117
(3): 353-386, doi:10.1007/BF01223371, MR953828
Schwarz' original paper introducing the ideas of TQFT's, in which he produces a Ray-Singer invariant from a QFT
functional:
• Schwarz, Albert (1979), "The partition function of a degenerate functional" , Communications in Mathematical
Physics 67 (1): 1-16, doi:10.1007/BF01223197
References
[1] http://www.numdam.org/item?id=PMIHES_1988_68_175_0
[2] http://www-math.mit.edu/~lurie/papers/cobordism.pdf
[3] http://projecteuclid.org/euclid.cmp/1104161738
[4] http://www.springerlink.com/content/n5w785042u121628/
Quantum Chromodynamics
In theoretical physics, quantum chromodynamics (QCD) is a theory of the strong interaction (color force), a
fundamental force describing the interactions of the quarks and gluons making up hadrons (such as the proton,
neutron or pion). It is the study of the SU(3) Yang-Mills theory of color-charged fermions (the quarks). QCD is a
quantum field theory of a special kind called a non-abelian gauge theory. It is an important part of the Standard
Model of particle physics. A huge body of experimental evidence for QCD has been gathered over the years.
QCD enjoys two peculiar properties:
• Confinement, which means that the force between quarks does not diminish as they are separated. Because of
this, it would take an infinite amount of energy to separate two quarks; they are forever bound into hadrons such
as the proton and the neutron. Although analytically unproven, confinement is widely believed to be true because
it explains the consistent failure of free quark searches, and it is easy to demonstrate in lattice QCD.
• Asymptotic freedom, which means that in very high-energy reactions, quarks and gluons interact very weakly.
This prediction of QCD was first discovered in the early 1970s by David Politzer and by Frank Wilczek and
David Gross. For this work they were awarded the 2004 Nobel Prize in Physics.
There is no known phase-transition line separating these two properties; confinement is dominant in low-energy
scales but, as energy increases, asymptotic freedom becomes dominant.
Terminology
The word quark was coined by American physicist Murray Gell-Mann (b. 1929) in its present sense. It originally
comes from the phrase "Three quarks for Muster Mark" in Finnegans Wake by James Joyce. On June 27, 1978,
Gell-Mann wrote a private letter to the editor of the Oxford English Dictionary, in which he related that he had been
influenced by Joyce's words: "The allusion to three quarks seemed perfect." (Originally, only three quarks had been
discovered.) Gell-Mann, however, wanted to pronounce the word with (ô) not (ä), as Joyce seemed to indicate by
rhyming words in the vicinity such as Mark. Gell-Mann got around that "by supposing that one ingredient of the line
'Three quarks for Muster Mark' was a cry of 'Three quarts for Mister . . . ' heard in H.C. Earwicker's pub," a plausible
suggestion given the complex punning in Joyce's novel. [1]
The three kinds of charge in QCD (as opposed to one in quantum electrodynamics or QED) are usually referred to as
"color charge" by loose analogy to the three kinds of color (red, green and blue) perceived by humans. Other than
this "clever" nomenclature, the quantum parameter "color" is completely unrelated to the everyday, familiar
phenomenon of color.
Since the theory of electric charge is dubbed "electrodynamics", the Greek word "chroma" (χρῶμα, meaning color)
is applied to the theory of color charge, "chromodynamics".
History
With the invention of bubble chambers and spark chambers in the 1950s, experimental particle physics discovered a
large and ever-growing number of particles called hadrons. It seemed that such a large number of particles could not
all be fundamental. First, the particles were classified by charge and isospin by Eugene Wigner and Werner
Heisenberg; then, in 1953, according to strangeness by Murray Gell-Mann and Kazuhiko Nishijima. To gain greater
insight, the hadrons were sorted into groups having similar properties and masses using the eightfold way, invented
in 1961 by Gell-Mann and Yuval Ne'eman. Gell-Mann and George Zweig, correcting an earlier approach of Shoichi
Sakata, went on to propose in 1963 that the structure of the groups could be explained by the existence of three
flavours of smaller particles inside the hadrons: the quarks.
Perhaps the first remark that quarks should possess an additional quantum number was made as a short footnote in
the preprint of Boris Struminsky [3] in connection with the Ω⁻ hyperon composed of three strange quarks with parallel
spins (this situation was peculiar because, since quarks are fermions, such a combination is forbidden by the Pauli
exclusion principle):
Three identical quarks cannot form an antisymmetric S-state. In order to realize an antisymmetric orbital
S-state, it is necessary for the quark to have an additional quantum number.
- B. V. Struminsky, Magnetic moments of barions in the quark model, JINR-Preprint P-1939, Dubna,
Submitted on January 7, 1965
Boris Struminsky was a PhD student of Nikolay Bogolyubov. The problem considered in this preprint was suggested
by Nikolay Bogolyubov, who advised Boris Struminsky in this research. In the beginning of 1965, Nikolay
Bogolyubov, Boris Struminsky and Albert Tavkhelidze wrote a preprint with a more detailed discussion of the
additional quark quantum degree of freedom. [4] This work was also presented by Albert Tavkhelidze, without
obtaining the consent of his collaborators for doing so, at an international conference in Trieste (Italy) in May 1965. [5]
A similar mysterious situation was with the Δ++ baryon; in the quark model, it is composed of three up quarks with
parallel spins. In 1965, Moo-Young Han with Yoichiro Nambu and Oscar W. Greenberg independently resolved the
problem by proposing that quarks possess an additional SU(3) gauge degree of freedom, later called color charge.
Han and Nambu noted that quarks might interact via an octet of vector gauge bosons: the gluons.
Since free quark searches consistently failed to turn up any evidence for the new particles, and because an
elementary particle back then was defined as a particle which could be separated and isolated, Gell-Mann often said
that quarks were merely convenient mathematical constructs, not real particles. The meaning of this statement was
usually clear in context: He meant quarks are confined, but he also was implying that the strong interactions could
probably not be fully described by quantum field theory.
Richard Feynman argued that high energy experiments showed quarks are real particles: he called them partons
(since they were parts of hadrons). By particles, Feynman meant objects which travel along paths, elementary
particles in a field theory.
The difference between Feynman's and Gell-Mann's approaches reflected a deep split in the theoretical physics
community. Feynman thought the quarks have a distribution of position or momentum, like any other particle, and
he (correctly) believed that the diffusion of parton momentum explained diffractive scattering. Although Gell-Mann
believed that certain quark charges could be localized, he was open to the possibility that the quarks themselves
could not be localized because space and time break down. This was the more radical approach of S-matrix theory.
James Bjorken proposed that pointlike partons would imply certain relations should hold in deep inelastic scattering
of electrons and protons, which were spectacularly verified in experiments at SLAC in 1969. This led physicists to
abandon the S-matrix approach for the strong interactions.
The discovery of asymptotic freedom in the strong interactions by David Gross, David Politzer and Frank Wilczek
allowed physicists to make precise predictions of the results of many high energy experiments using the quantum
field theory technique of perturbation theory. Evidence of gluons was discovered in three jet events at PETRA in
1979. These experiments became more and more precise, culminating in the verification of perturbative QCD at the
level of a few percent at the LEP in CERN.
The other side of asymptotic freedom is confinement. Since the force between color charges does not decrease with
distance, it is believed that quarks and gluons can never be liberated from hadrons. This aspect of the theory is
verified within lattice QCD computations, but is not mathematically proven. One of the Millennium Prize Problems
announced by the Clay Mathematics Institute requires a claimant to produce such a proof. Other aspects of
non-perturbative QCD are the exploration of phases of quark matter, including the quark-gluon plasma.
The relation between the short-distance particle limit and the confining long-distance limit is one of the topics
recently explored using string theory, the modern form of S-matrix theory. [7][8]
Theory
Some definitions
Every field theory of particle physics is based on certain symmetries of nature whose existence is deduced from
observations. These can be
• local symmetries, that is the symmetry acts independently at each point in space-time. Each such symmetry is the
basis of a gauge theory and requires the introduction of its own gauge bosons.
• global symmetries, which are symmetries whose operations must be simultaneously applied to all points of
space-time.
QCD is a gauge theory of the SU(3) gauge group obtained by taking the color charge to define a local symmetry.
Since the strong interaction does not discriminate between different flavors of quark, QCD has approximate flavor
symmetry, which is broken by the differing masses of the quarks.
There are additional global symmetries whose definitions require the notion of chirality, the discrimination between
left- and right-handed particles. If the spin of a particle has a positive projection on its direction of motion then it is called
right-handed; otherwise, it is left-handed. Chirality and handedness (helicity) are not the same, but become approximately
equivalent at high energies.
• Chiral symmetries involve independent transformations of these two types of particle.
• Vector symmetries (also called diagonal symmetries) mean the same transformation is applied on the two
chiralities.
• Axial symmetries are those in which one transformation is applied on left-handed particles and the inverse on the
right-handed particles.
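As a minimal illustration of the three kinds of transformation listed above, consider the simplest case of a single abelian phase acting on a Dirac field ψ. The restriction to U(1) phases and the symbols θ, θ_L and θ_R are assumptions of this sketch, not notation from the text.

```latex
\begin{align*}
  \psi_L &= \tfrac{1}{2}\,(1-\gamma_5)\,\psi, &
  \psi_R &= \tfrac{1}{2}\,(1+\gamma_5)\,\psi
  && \text{(chiral components)} \\
  \psi_L &\to e^{i\theta_L}\,\psi_L, &
  \psi_R &\to e^{i\theta_R}\,\psi_R
  && \text{(chiral: independent phases)} \\
  \psi_L &\to e^{i\theta}\,\psi_L, &
  \psi_R &\to e^{i\theta}\,\psi_R
  && \text{(vector: } \psi \to e^{i\theta}\psi \text{)} \\
  \psi_L &\to e^{i\theta}\,\psi_L, &
  \psi_R &\to e^{-i\theta}\,\psi_R
  && \text{(axial: } \psi \to e^{-i\theta\gamma_5}\psi \text{)}
\end{align*}
```

The vector and axial cases are the special choices θ_L = θ_R and θ_L = −θ_R of the general chiral transformation.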
Additional remarks: duality
As mentioned, asymptotic freedom means that at large energy (which also corresponds to short distances) there is
practically no interaction between the particles. This is in contrast (more precisely one would say: dual) to what one
is used to, since usually one connects the absence of interactions with large distances. However, as already
mentioned in the original paper of Franz Wegner, [9] a solid state theorist who in 1971 introduced simple gauge
invariant lattice models, the high-temperature behaviour of the original model, e.g. the strong decay of correlations
at large distances, corresponds to the low-temperature behaviour of the (usually ordered!) dual model, namely the
asymptotic decay of non-trivial correlations, e.g. short-range deviations from almost perfect arrangements, for short
distances. Here, in contrast to Wegner, we have only the dual model, which is the one described in this article. [10]
Symmetry groups
The color group SU(3) corresponds to the local symmetry whose gauging gives rise to QCD. The electric charge
labels a representation of the local symmetry group U(1) which is gauged to give QED: this is an abelian group. If
one considers a version of QCD with $N_f$ flavors of massless quarks, then there is a global (chiral) flavor symmetry
group $SU_L(N_f) \times SU_R(N_f) \times U_B(1) \times U_A(1)$. The chiral symmetry is spontaneously broken by the QCD
vacuum to the vector (L+R) $SU_V(N_f)$ with the formation of a chiral condensate. The vector symmetry $U_B(1)$
corresponds to the baryon number of quarks and is an exact symmetry. The axial symmetry $U_A(1)$ is exact in the
classical theory, but broken in the quantum theory, an occurrence called an anomaly. Gluon field configurations
called instantons are closely related to this anomaly.
There are two different types of SU(3) symmetry: there is the symmetry that acts on the different colors of quarks,
and this is an exact gauge symmetry mediated by the gluons, and there is also a flavor symmetry which rotates
different flavors of quarks to each other, or flavor SU(3). Flavor SU(3) is an approximate symmetry of the vacuum of
QCD, and is not a fundamental symmetry at all. It is an accidental consequence of the small mass of the three
lightest quarks.
In the QCD vacuum there are vacuum condensates of all the quarks whose mass is less than the QCD scale. This
includes the up and down quarks, and to a lesser extent the strange quark, but not any of the others. The vacuum is
symmetric under SU(2) isospin rotations of up and down, and to a lesser extent under rotations of up, down and
strange, or full flavor group SU(3), and the observed particles make isospin and SU(3) multiplets.
The approximate flavor symmetries do have associated gauge bosons, observed particles like the rho and the omega,
but these particles are nothing like the gluons and they are not massless. They are emergent gauge bosons in an
approximate string description of QCD.
Lagrangian
The dynamics of the quarks and gluons are controlled by the quantum chromodynamics Lagrangian. The gauge
invariant QCD Lagrangian is
$$\begin{aligned}
\mathcal{L}_\mathrm{QCD} &= \bar{\psi}_i\left(i\gamma^\mu (D_\mu)_{ij} - m\,\delta_{ij}\right)\psi_j - \frac{1}{4}\,G^a_{\mu\nu}G^{\mu\nu}_a \\
&= \bar{\psi}_i\left(i\gamma^\mu\partial_\mu - m\right)\psi_i - g\,G^a_\mu\,\bar{\psi}_i\gamma^\mu T^a_{ij}\psi_j - \frac{1}{4}\,G^a_{\mu\nu}G^{\mu\nu}_a ,
\end{aligned}$$
where $\psi_i(x)$ is the quark field, a dynamical function of space-time, in the fundamental representation of the SU(3)
gauge group, indexed by $i, j, \ldots$; $G^a_\mu(x)$ are the gluon fields, also dynamical functions of space-time, in the
adjoint representation of the SU(3) gauge group, indexed by $a, b, \ldots$. The $\gamma^\mu$ are Dirac matrices connecting the
spinor representation to the vector representation of the Lorentz group; and $T^a_{ij}$ are the generators connecting the
fundamental, antifundamental and adjoint representations of the SU(3) gauge group. The Gell-Mann matrices
provide one such representation for the generators.
The symbol $G^a_{\mu\nu}$ represents the gauge invariant gluonic field strength tensor, analogous to the electromagnetic field
strength tensor, $F^{\mu\nu}$, in electrodynamics. It is given by
$$G^a_{\mu\nu} = \partial_\mu G^a_\nu - \partial_\nu G^a_\mu - g f^{abc}\, G^b_\mu G^c_\nu ,$$
where $f^{abc}$ are the structure constants of SU(3). Note that the rules to move up or pull down the a, b, or c indexes
are trivial, (+, ..., +), so that $f^{abc} = f_{abc} = f^a_{\ bc}$, whereas for the μ or ν indexes one has the non-trivial relativistic
rules, corresponding e.g. to the metric signature (+ − − −). Furthermore, for mathematicians, according to this formula the
gluon colour field can be represented by an SU(3)-Lie algebra-valued "curvature" 2-form $\mathbf{G} = \mathrm{d}\tilde{\mathbf{G}} - g\,\tilde{\mathbf{G}} \wedge \tilde{\mathbf{G}}$,
where $\tilde{\mathbf{G}}$ is a "vector potential" 1-form corresponding to $\mathbf{G}$ and $\wedge$ is the (antisymmetric) "wedge product" of this
algebra, producing the "structure constants" $f^{abc}$. The Cartan derivative of the field form (i.e. essentially the
divergence of the field) would be zero in the absence of the "gluon terms", i.e. those ~ g, which represent the
non-abelian character of SU(3).
The constants m and g control the quark mass and coupling constants of the theory, subject to renormalization in
the full quantum theory.
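The structure constants can be computed directly from the Gell-Mann matrices mentioned above. The following is a minimal sketch in plain Python, assuming the standard convention $[\lambda_a, \lambda_b] = 2i f^{abc}\lambda_c$ with $\mathrm{Tr}(\lambda_a\lambda_b) = 2\delta_{ab}$, so that $f^{abc} = \mathrm{Tr}([\lambda_a,\lambda_b]\lambda_c)/(4i)$; the helper names are hypothetical.

```python
import math

I = 1j
R3 = 1 / math.sqrt(3)

# The eight Gell-Mann matrices lambda_1 ... lambda_8 (standard convention).
GELL_MANN = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, -I, 0], [I, 0, 0], [0, 0, 0]],
    [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
    [[0, 0, -I], [0, 0, 0], [I, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
    [[0, 0, 0], [0, 0, -I], [0, I, 0]],
    [[R3, 0, 0], [0, R3, 0], [0, 0, -2 * R3]],
]


def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]


def trace(a):
    return sum(a[i][i] for i in range(3))


def structure_constant(a, b, c):
    """f^{abc} for colour indices a, b, c in 1..8 (physics numbering)."""
    la, lb, lc = GELL_MANN[a - 1], GELL_MANN[b - 1], GELL_MANN[c - 1]
    value = trace(matmul(commutator(la, lb), lc)) / (4 * I)
    return value.real


if __name__ == "__main__":
    for triple in [(1, 2, 3), (1, 4, 7), (4, 5, 8)]:
        print(triple, round(structure_constant(*triple), 6))
```

Since the colour indices are raised and lowered trivially, the same numbers serve as $f^{abc}$ and $f_{abc}$; they are totally antisymmetric, which is what makes the last term of the field strength tensor non-vanishing.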
An important theoretical notion concerning the final term of the above Lagrangian is the Wilson loop variable. This
loop variable plays a most-important role in discretized forms of the QCD (see lattice QCD), and more generally, it
distinguishes confined and deconfined states of a gauge theory. It was introduced by the Nobel prize winner Kenneth
G. Wilson and is treated in a separate article.
Fields
Quarks are massive spin-1/2 fermions which carry a color charge whose gauging is the content of QCD. Quarks are
represented by Dirac fields in the fundamental representation 3 of the gauge group SU(3). They also carry electric
charge (either -1/3 or 2/3) and participate in weak interactions as part of weak isospin doublets. They carry global
quantum numbers including the baryon number, which is 1/3 for each quark, hypercharge and one of the flavor
quantum numbers.
Gluons are spin-1 bosons which also carry color charges, since they lie in the adjoint representation 8 of SU(3). They
have no electric charge, do not participate in the weak interactions, and have no flavor. They lie in the singlet
representation 1 of all these symmetry groups.
Every quark has its own antiquark. The charge of each antiquark is exactly the opposite of the corresponding quark.
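The bookkeeping of these quantum numbers is simple arithmetic, as the following sketch shows for a few hadrons named in this article. The label convention (a trailing "bar" for antiquarks) and the function name are assumptions of the example; only the three lightest flavours are included.

```python
from fractions import Fraction

# Electric charges in units of the elementary charge, as quoted in the text.
QUARK_CHARGE = {
    "u": Fraction(2, 3),
    "d": Fraction(-1, 3),
    "s": Fraction(-1, 3),
}
QUARK_BARYON_NUMBER = Fraction(1, 3)


def quantum_numbers(constituents):
    """Return (electric charge, baryon number) of a bound state.

    `constituents` is a list of labels such as "u" or "dbar"; an antiquark
    carries exactly the opposite charges of the corresponding quark.
    """
    charge = Fraction(0)
    baryon = Fraction(0)
    for label in constituents:
        is_anti = label.endswith("bar")
        flavour = label[:-3] if is_anti else label
        sign = -1 if is_anti else 1
        charge += sign * QUARK_CHARGE[flavour]
        baryon += sign * QUARK_BARYON_NUMBER
    return charge, baryon


if __name__ == "__main__":
    hadrons = {
        "proton": ["u", "u", "d"],
        "neutron": ["u", "d", "d"],
        "pi+": ["u", "dbar"],
        "Delta++": ["u", "u", "u"],
        "Omega-": ["s", "s", "s"],
    }
    for name, content in hadrons.items():
        print(name, quantum_numbers(content))
```

Exact fractions are used so that sums such as 2/3 + 2/3 − 1/3 come out as integers without rounding.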
Dynamics
According to the rules of quantum field theory, and the associated Feynman diagrams, the above theory gives rise to
three basic interactions: a quark may emit (or absorb) a gluon, a gluon may emit (or absorb) a gluon, and two gluons
may directly interact. This contrasts with QED, in which only the first kind of interaction occurs, since photons have
no charge. Diagrams involving Faddeev-Popov ghosts must be considered too.
Area law and confinement
Detailed computations with the above-mentioned Lagrangian [11] show that the effective potential between a quark
and its anti-quark in a meson contains a term $\propto r$, which represents some kind of "stiffness" of the interaction
between the particle and its anti-particle at large distances, similar to the entropic elasticity of a rubber band (see
below). This leads to confinement of the quarks to the interior of hadrons, i.e. mesons and nucleons, with typical
radii $R_c$, corresponding to former "Bag models" of the hadrons [13]. The order of magnitude of the "bag radius" is 1
fm ($= 10^{-15}$ m). Moreover, the above-mentioned stiffness is quantitatively related to the so-called "area law"
behaviour of the expectation value of the Wilson loop product $P_W$ of the ordered coupling constants around a
closed loop W; i.e. $\langle P_W \rangle$ falls off exponentially with the area enclosed by the loop, so that $-\ln\langle P_W \rangle$ is
proportional to that area. For this behaviour the non-abelian
behaviour of the gauge group is essential.
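The link between the area law and the linear term in the potential can be written out in one line. This is a sketch under standard assumptions: the symbol σ (the string tension) and the choice of a rectangular loop of spatial extent R and temporal extent T are introduced here for illustration and do not appear in the text.

```latex
\begin{align*}
  \langle P_W \rangle &\sim \exp\bigl(-\sigma\, A(W)\bigr),
      \qquad A(W) = R\,T \ \text{for a rectangular loop}, \\
  V(R) &= -\lim_{T\to\infty} \frac{1}{T}\,\ln \langle P_W \rangle \;=\; \sigma R .
\end{align*}
```

A constant force σ between quark and anti-quark at large separation is the "stiffness" referred to above.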
Methods
Further analysis of the content of the theory is complicated. Various techniques have been developed to work with
QCD. Some of them are discussed briefly below.
Perturbative QCD
This approach is based on asymptotic freedom, which allows perturbation theory to be used accurately in
experiments performed at very high energies. Although limited in scope, this approach has resulted in the most
precise tests of QCD to date.
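Asymptotic freedom can be illustrated with the one-loop running of the strong coupling. The sketch below assumes the standard one-loop formula $\alpha_s(Q^2) = \alpha_s(\mu^2) / \bigl(1 + b_0\,\alpha_s(\mu^2)\ln(Q^2/\mu^2)\bigr)$ with $b_0 = (33 - 2 n_f)/(12\pi)$. The reference value of roughly 0.118 at the Z mass, the fixed number of five active flavours and all names are assumptions of this example, not values from the text.

```python
import math

ALPHA_S_MZ = 0.118  # assumed reference coupling at the Z mass (approximate)
M_Z = 91.19         # assumed Z boson mass in GeV (approximate)


def beta0(n_flavours):
    """One-loop coefficient b0 = (33 - 2 n_f) / (12 pi)."""
    return (33 - 2 * n_flavours) / (12 * math.pi)


def alpha_s_one_loop(q, alpha_ref=ALPHA_S_MZ, mu=M_Z, n_flavours=5):
    """One-loop strong coupling at energy scale q (GeV).

    Flavour thresholds and higher-loop terms are ignored, so the numbers
    are only indicative, especially at low q.
    """
    log_ratio = math.log(q ** 2 / mu ** 2)
    return alpha_ref / (1 + beta0(n_flavours) * alpha_ref * log_ratio)


if __name__ == "__main__":
    for scale in (10.0, M_Z, 1000.0):
        print(f"alpha_s({scale:g} GeV) ~ {alpha_s_one_loop(scale):.4f}")
```

The coupling decreases as the scale grows because $b_0$ is positive for fewer than 17 quark flavours; this is why perturbation theory becomes reliable at very high energies.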
Lattice QCD
Among non-perturbative approaches to QCD, the best established one is lattice QCD. This approach uses a
discrete set of space-time points (called the lattice) to reduce the analytically intractable path integrals of the
continuum theory to a very difficult numerical computation which is then carried out on supercomputers like the
QCDOC, which was constructed for precisely this purpose. While it is a slow and resource-intensive approach, it has
wide applicability, giving insight into parts of the theory inaccessible by other means. However, the numerical sign
problem makes it difficult to use lattice methods to study QCD at high density and low temperature (e.g. nuclear
matter or the interior of neutron stars).
1/N expansion
A well-known approximation scheme, the 1/N expansion, starts from the premise that the number of colors is
infinite, and makes a series of corrections to account for the fact that it is not. Until now it has been the source of
qualitative insight rather than a method for quantitative predictions. Modern variants include the AdS/CFT approach.
Effective theories
For specific problems effective theories may be written down which give qualitatively correct results in certain
limits. In the best of cases, these may then be obtained as systematic expansions in some parameter of the QCD
Lagrangian. One such effective field theory is chiral perturbation theory or ChiPT, which is the QCD effective
theory at low energies. More precisely, it is a low energy expansion based on the spontaneous chiral symmetry
breaking of QCD. Chiral symmetry is exact when the quark masses are equal to zero, and for the u, d and s quarks,
which have small mass, it is still a good approximate symmetry. Depending on the number of quarks which are
treated as light, one uses either SU(2) ChiPT or SU(3) ChiPT. Other effective theories are heavy quark effective
theory (which expands around heavy quark mass near infinity), and soft-collinear effective theory (which expands
around large ratios of energy scales). In addition to effective theories, models like the Nambu-Jona-Lasinio model
and the chiral model are often used when discussing general features.
QCD Sum Rules
Based on an operator product expansion, one can derive sets of relations that connect different observables with each
other.
Experimental tests
The notion of quark flavours was prompted by the necessity of explaining the properties of hadrons during the
development of the quark model. The notion of colour was necessitated by the puzzle of the Δ++. This has been dealt
with in the section on the history of QCD.
The first evidence for quarks as real constituent elements of hadrons was obtained in deep inelastic scattering
experiments at SLAC. The first evidence for gluons came in three jet events at PETRA.
Good quantitative tests of perturbative QCD are
• the running of the QCD coupling as deduced from many observations
• scaling violation in polarized and unpolarized deep inelastic scattering
• vector boson production at colliders (this includes the Drell-Yan process)
• jet cross sections in colliders
• event shape observables at the LEP
• heavy-quark production in colliders
Quantitative tests of non-perturbative QCD are fewer, because the predictions are harder to make. The best is
probably the running of the QCD coupling as probed through lattice computations of heavy-quarkonium spectra.
There is a recent claim about the mass of the heavy meson $B_c$ [14]. Other non-perturbative tests are currently at the
level of 5% at best. Continuing work on masses and form factors of hadrons and their weak matrix elements are
promising candidates for future quantitative tests. The whole subject of quark matter and the quark-gluon plasma is a
non-perturbative test bed for QCD which still remains to be properly exploited.
Cross-relations to Solid State Physics
There are unexpected cross-relations to solid state physics. For example, the notion of gauge invariance forms the
basis of the well-known Mattis spin glasses, [15] which are systems with the usual spin degrees of freedom $s_i = \pm 1$
for $i = 1, \ldots, N$, with the special fixed "random" couplings $J_{i,k} = \epsilon_i J_0 \epsilon_k$. Here the $\epsilon_i$ and $\epsilon_k$ quantities can
independently and "randomly" take the values ±1, which corresponds to a most simple gauge transformation
$(s_i \to s_i \cdot \epsilon_i,\; J_{i,k} \to \epsilon_i J_{i,k} \epsilon_k,\; s_k \to s_k \cdot \epsilon_k)$. This means that thermodynamic expectation values of
measurable quantities, e.g. of the energy $\mathcal{H} := -\sum s_i J_{i,k} s_k$, are invariant.
However, here the coupling degrees of freedom $J_{i,k}$, which in the QCD correspond to the gluons, are "frozen" to
fixed values (quenching). In contrast, in the QCD they "fluctuate" (annealing), and through the large number of
gauge degrees of freedom the entropy plays an important role (see below).
For positive $J_0$ the thermodynamics of the Mattis spin glass corresponds in fact simply to a ferromagnet, just
because these systems have no "frustration" at all. This term is a basic measure in spin glass theory. [16]
Quantitatively it is identical with the loop product $P_W := J_{i,k} J_{k,l} \cdots J_{n,m} J_{m,i}$ along a closed loop W. However,
for a Mattis spin glass, in contrast to "genuine" spin glasses, the quantity $P_W$ never becomes negative.
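These statements about the Mattis model are easy to check numerically. The sketch below builds couplings $J_{i,k} = \epsilon_i J_0 \epsilon_k$ for a small system, applies the gauge transformation, and evaluates the loop product; the system size, the random seed and all function names are assumptions of the example, and the energy sum is taken over pairs i < k.

```python
import random


def mattis_couplings(eps, j0=1):
    """Couplings J[i][k] = eps_i * j0 * eps_k (no self-coupling)."""
    n = len(eps)
    return [[0 if i == k else j0 * eps[i] * eps[k] for k in range(n)]
            for i in range(n)]


def energy(spins, couplings):
    """H = - sum over pairs i < k of s_i J_ik s_k."""
    n = len(spins)
    return -sum(spins[i] * couplings[i][k] * spins[k]
                for i in range(n) for k in range(i + 1, n))


def gauge_transform(spins, couplings, eps):
    """Apply s_i -> s_i eps_i and J_ik -> eps_i J_ik eps_k."""
    n = len(spins)
    new_spins = [s * e for s, e in zip(spins, eps)]
    new_couplings = [[eps[i] * couplings[i][k] * eps[k] for k in range(n)]
                     for i in range(n)]
    return new_spins, new_couplings


def loop_product(couplings, loop):
    """Product of couplings along the closed loop of site indices `loop`."""
    result = 1
    for a, b in zip(loop, loop[1:] + loop[:1]):
        result *= couplings[a][b]
    return result


if __name__ == "__main__":
    rng = random.Random(0)
    n = 8
    eps = [rng.choice((-1, 1)) for _ in range(n)]
    spins = [rng.choice((-1, 1)) for _ in range(n)]
    gauge = [rng.choice((-1, 1)) for _ in range(n)]
    J = mattis_couplings(eps, j0=1)

    new_spins, new_J = gauge_transform(spins, J, gauge)
    print("energy invariant:", energy(spins, J) == energy(new_spins, new_J))
    print("loop product:", loop_product(J, [0, 3, 5, 2]))
```

Choosing the gauge transformation equal to the $\epsilon_i$ that define the couplings maps every $J_{i,k}$ to $J_0$, which is the sense in which the Mattis model is a ferromagnet in disguise; each $\epsilon_i$ enters a closed loop twice, so the loop product is a power of $J_0$.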
The basic notion "frustration" of the spin glass is actually similar to the Wilson loop quantity of the QCD. The only
difference is again that in the QCD one is dealing with SU(3) matrices, and that one is dealing with a "fluctuating"
quantity. Energetically, perfect absence of frustration should be non-favorable and untypical for a spin glass, which
means that one should add the loop product to the Hamiltonian, by some kind of term representing a "punishment".
In the QCD, by contrast, the Wilson loop is essential for the Lagrangian right away.
The relation between the QCD and "disordered magnetic systems" (the spin glasses belong to them) was
additionally stressed in a paper by Fradkin, Huberman and Shenker, [17] which also stresses the notion of duality.
A further analogy consists in the already mentioned similarity to polymer physics, where, analogously to Wilson
loops, so-called "entangled nets" appear, which are important for the formation of the entropy-elasticity (force
proportional to the length) of a rubber band. The non-abelian character of the SU(3) corresponds thereby to the
non-trivial "chemical links", which glue different loop segments together, and "asymptotic freedom" means in the
polymer analogy simply the fact that in the short-wave limit, i.e. for $0 \leftarrow \lambda_w \ll R_c$ (where $R_c$ is a characteristic
correlation length for the glued loops, corresponding to the above-mentioned "bag radius", while $\lambda_w$ is the
wavelength of an excitation) any non-trivial correlation vanishes totally, as if the system had crystallized. [18]
There is also a correspondence between confinement in QCD (the fact that the colour field is only different from
zero in the interior of hadrons) and the behaviour of the usual magnetic field in the theory of type-II
superconductors: there the magnetism is confined to the interior of the Abrikosov flux-line lattice, [19] i.e., the
London penetration depth λ of that theory is analogous to the confinement radius $R_c$ of quantum chromodynamics.
Mathematically, this correspondence is supported by the second term, $\propto g\,G^a_\mu\,\bar{\psi}_i\gamma^\mu T^a_{ij}\psi_j$, on the r.h.s. of the
Lagrangian.
References
[1] Gell-Mann, Murray (1995). The Quark and the Jaguar. Owl Books. ISBN 978-0805072532.
[2] Fyodor Tkachov (2009). "A contribution to the history of quarks: Boris Struminsky's 1965 JINR publication". arXiv:0904.0343
[physics.hist-ph].
[3] B. V. Struminsky, Magnetic moments of barions in the quark model. JINR-Preprint P-1939, Dubna, Russia. Submitted on January 7, 1965.
[4] N. Bogolubov, B. Struminsky, A. Tavkhelidze. On composite models in the theory of elementary particles. JINR Preprint D-1968, Dubna
1965.
[5] A. Tavkhelidze. Proc. Seminar on High Energy Physics and Elementary Particles, Trieste, 1965, Vienna IAEA, 1965, p. 763.
[6] V. A. Matveev and A. N. Tavkhelidze (INR, RAS, Moscow) The quantum number color, colored quarks and QCD (http://www.inr.ru/
quantum.html) (Dedicated to the 40th Anniversary of the Discovery of the Quantum Number Color). Report presented at the 99th Session of
the JINR Scientific Council, Dubna, 19-20 January 2006.
[7] J. Polchinski, M. Strassler (2002). "Hard Scattering and Gauge/String duality". Physical Review Letters 88: 31601.
doi:10.1103/PhysRevLett.88.031601.
[8] Brower, Richard C; Mathur, Samir D.; Chung-I Tan (2000). "Glueball Spectrum for QCD from AdS Supergravity Duality".
arXiv:hep-th/0003115 [hep-th].
[9] F. Wegner, Duality in Generalized Ising Models and Phase Transitions without Local Order Parameter, J. Math. Phys. 12 (1971) 2259-2272.
Reprinted in Claudio Rebbi (ed.), Lattice Gauge Theories and Monte Carlo Simulations, World Scientific,
Singapore (1983), p. 60-73. Abstract: (http://www.tphys.uni-heidelberg.de/~wegner/Abstracts.html#12)
[10] Perhaps one can guess that in the "original" model mainly the quarks would fluctuate, whereas in the present one, the "dual" model, mainly
the gluons do.
[11] See all standard textbooks on the QCD, e.g., those noted above
[12] Only at extremely large pressures and/or temperatures, e.g. for $T \cong 5 \cdot 10^{12}$ K or larger, confinement gives way to a quark-gluon
plasma.
[13] Kenneth A. Johnson, The bag model of quark confinement, Scientific American, July 1979
[14] http://www.aip.org/pnu/2005/split/731-1.html
[15] D.C. Mattis, Phys. Lett. 56a (1976) 421
[16] J. Vannimenus and G. Toulouse, J. Phys. C 10 (1977) 537
[17] E. Fradkin, B.A. Huberman, S. Shenker, Gauge Symmetries in random magnetic systems, Phys. Rev. B 18 (1978) 4783-4794, (http://prb.aps.org/abstract/PRB/v18/i9/p4879-1)
[18] A. Bergmann, A. Owen, Dielectric relaxation spectroscopy of poly[(R)-3-Hydroxybutyrate] (PHB) during crystallization, Polymer
International 53 (7) (2004) 863-868, (http://www3.interscience.wiley.com/journal/108563755/abstract?CRETRY=1&SRETRY=0)
[19] Mathematically, the flux-line lattices are described by Emil Artin's braid group, which is nonabelian, since one braid can wind around
another one.
Further reading
• Greiner, Walter; Schäfer, Andreas (1994). Quantum Chromodynamics. Springer. ISBN 0-387-57103-5.
• Halzen, Francis; Martin, Alan (1984). Quarks & Leptons: An Introductory Course in Modern Particle Physics.
John Wiley & Sons. ISBN 0-471-88741-2.
• Creutz, Michael (1985). Quarks, Gluons and Lattices. Cambridge University Press. ISBN 978-0521315357.
External links
• Particle data group (http://pdg.lbl.gov/)
• The millennium prize (http://www.claymath.org/millennium/) for proving confinement (http://www.claymath.org/millennium/)
• Ab Initio Determination of Light Hadron Masses (http://www.sciencemag.org/cgi/content/abstract/322/5905/1224)
• Andreas S Kronfeld (http://www.sciencemag.org/cgi/content/summary/322/5905/1198) The Weight of the
World Is Quantum Chromodynamics
• Andreas S Kronfeld (http://www.iop.org/EJ/article/1742-6596/125/1/012067/jpconf8_125_012067.pdf?request-id=f9ccdf0d-ee26-4856-99fb-ce5bfef07c4c) Quantum chromodynamics with advanced computing
• Standard model gets right answer (http://www.sciencenews.org/view/generic/id/38788/title/Standard_model_gets_right_answer_for_proton,_neutron_masses)
• Quantum Chromodynamics (http://arxiv.org/abs/hep-ph/9505231)
Quantum Geometry
In theoretical physics, quantum geometry is the set of new mathematical concepts generalizing the concepts of
geometry whose understanding is necessary to describe the physical phenomena at very short distance scales
(comparable to the Planck length). At these distances, quantum mechanics has a profound effect on physics.
Each theory of quantum gravity uses the term quantum geometry in a slightly different fashion. String theory, a
leading candidate for a quantum theory of gravity, uses the term quantum geometry to describe exotic phenomena
such as T-duality and other geometric dualities, mirror symmetry, topology-changing transitions, minimal possible
distance scale, and other effects that challenge our usual geometrical intuition. More technically, quantum geometry
refers to the shape of the spacetime manifold as seen by D-branes which includes the quantum corrections to the
metric tensor, such as the worldsheet instantons. For example, the quantum volume of a cycle is computed from the
mass of a brane wrapped on this cycle.
In an alternative approach to quantum gravity called loop quantum gravity (LQG), the phrase quantum geometry
usually refers to the formalism within LQG where the observables that capture the information about the geometry
are now well defined operators on a Hilbert space. In particular, certain physical observables, such as the area, have a
discrete spectrum. It has also been shown that the loop quantum geometry is non-commutative.
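For orientation, the discreteness of the area can be stated as a formula. The expression below is the commonly quoted main series of the LQG area spectrum; it is not given in the text, and the symbols γ (the Barbero-Immirzi parameter), $\ell_P$ (the Planck length) and $j_i$ (the spins labelling the spin-network edges that puncture the surface) are introduced here as assumptions of the sketch.

```latex
\[
  A \;=\; 8\pi\gamma\,\ell_P^{2} \sum_i \sqrt{j_i\,(j_i+1)},
  \qquad j_i \in \left\{ \tfrac{1}{2},\, 1,\, \tfrac{3}{2},\, \dots \right\}
\]
```

The smallest non-zero eigenvalue in this series, obtained for a single puncture with $j = 1/2$, is $4\sqrt{3}\,\pi\gamma\,\ell_P^{2}$.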
It is possible (but considered unlikely) that this strictly quantized understanding of geometry will be consistent with
the quantum picture of geometry arising from string theory.
Another quite successful approach, which tries to reconstruct the geometry of space-time from "first principles", is
discrete Lorentzian quantum gravity.
External links
• Space and Time: From Antiquity to Einstein and Beyond (http://cgpg.gravity.psu.edu/people/Ashtekar/articles/spaceandtime.pdf)
• Quantum Geometry and its Applications (http://cgpg.gravity.psu.edu/people/Ashtekar/articles/qgfinal.pdf)
• Hypercomplex Numbers in Geometry and Physics (http://hypercomplex.xpsweb.com/articles/221/en/pdf/main-01e.pdf)
Loop Quantum Gravity
Loop quantum gravity (LQG), also known as loop gravity and quantum geometry, is a proposed quantum theory
of spacetime which attempts to reconcile the theories of quantum mechanics and general relativity. Loop quantum
gravity suggests that space can be viewed as an extremely fine fabric or network "woven" of finite quantised loops of
excited gravitational fields called spin networks. When viewed over time, these spin networks are called spin foam,
which should not be confused with quantum foam. Loop quantum gravity is a major contender with string theory for a
quantum theory of gravity; it incorporates general relativity without requiring string theory's higher dimensions.
LQG preserves many of the important features of general relativity, while simultaneously employing quantization of
both space and time at the Planck scale in the tradition of quantum mechanics. The technique of loop quantization
was developed for the nonperturbative quantization of diffeomorphism-invariant gauge theory. Roughly, LQG tries
to establish a quantum theory of gravity in which the very space itself, where all other physical phenomena occur,
becomes quantized.
LQG is one of a family of theories called canonical quantum gravity. The LQG theory also includes matter and
forces, but does not address the problem of the unification of all physical forces the way some other quantum gravity
theories such as string theory do.
History of LQG
In 1986, Abhay Ashtekar reformulated Einstein's field equations of general relativity, using what have come to be
known as Ashtekar variables, a particular flavor of Einstein-Cartan theory with a complex connection. In 1988, Carlo
Rovelli and Lee Smolin used this formalism to introduce the loop representation of quantum general relativity,
which was soon developed by Ashtekar, Rovelli, Smolin and many others. In the Ashtekar formulation, the
fundamental objects are a rule for parallel transport (technically, a connection) and a coordinate frame (called a
vierbein) at each point. Because the Ashtekar formulation was background-independent, it was possible to use
Wilson loops as the basis for a nonperturbative quantization of gravity. Explicit (spatial) diffeomorphism invariance
of the vacuum state plays an essential role in the regularization of the Wilson loop states.
Around 1990, Rovelli and Smolin obtained an explicit basis of states of quantum geometry, which turned out to be
labelled by Roger Penrose's spin networks, and showed that the geometry is quantized, that is, the
(non-gauge-invariant) quantum operators representing area and volume have a discrete spectrum. In this context,
spin networks arose as a generalization of Wilson loops necessary to deal with mutually intersecting loops.
Mathematically, spin networks are related to group representation theory and can be used to construct knot invariants
such as the Jones polynomial.
Key concepts of loop quantum gravity
In the framework of quantum field theory, and using the standard techniques of perturbative calculations, one finds
that gravitation is non-renormalizable in contrast to the electroweak and strong interactions of the Standard Model of
particle physics. This implies that there are infinitely many free parameters in the theory and thus that it cannot be
predictive.
In general relativity, the Einstein field equations assign a geometry (via a metric) to space-time. Before this, there is
no physical notion of distance or time measurements. In this sense, general relativity is said to be background
independent. An immediate conceptual issue that arises is that the usual framework of quantum mechanics, including
quantum field theory, relies on a reference (background) space-time. Therefore, one approach to finding a quantum
theory of gravity is to understand how to do quantum mechanics without relying on such a background; this is the
approach of the canonical quantization/loop quantum gravity/spin foam approaches.
Applying canonical quantization to the initial-value formulation of general relativity (cf. the evolution equations of
general relativity) yields an analogue of the Schrödinger equation called the Wheeler-DeWitt equation, which some
argue is ill-defined. A major breakthrough came with the introduction of what are now known as Ashtekar variables,
which represent geometric gravity using mathematical analogues of electric and magnetic fields. The resulting
candidate for a theory of quantum gravity is loop quantum gravity, in which space is represented by a network
structure called a spin network, evolving over time in discrete steps. [3]
[Figure: Simple spin network of the type used in loop quantum gravity]
Though not proven, it may be impossible to quantize gravity in 3+1 dimensions without creating matter and
energy artifacts. Should LQG succeed as a quantum theory of gravity, the known matter fields will have to be
incorporated into the theory a posteriori. Many of the approaches now being actively pursued (by Renate Loll, Jan
Ambjørn, Lee Smolin, Sundance Bilson-Thompson, Laurent Freidel, Mark B. Wise and others [4]) combine matter
with geometry.
The main successes of loop quantum gravity are:
1. It is a nonperturbative quantization of 3-space geometry, with quantized area and volume operators.
2. It includes a calculation of the entropy of black holes.
3. It replaces the Big Bang spacetime singularity with a Big Bounce.
These claims are not universally accepted within the physics community, which is presently divided between
different approaches to the problem of quantum gravity. LQG may possibly be viable as a refinement of either
gravity or geometry. Many of the core results are rigorous mathematical physics; their physical interpretations
remain speculative. Three aspects of LQG's core mathematical results whose physical interpretation is discussed
below are loop quantization, Lorentz invariance, and general covariance together with background independence.
Another physical test for LQG is to reproduce the physics of general relativity coupled with quantum field theory,
discussed under Problems.
Loop quantization
At the core of loop quantum gravity is a framework for nonperturbative quantization of diffeomorphism-invariant
gauge theories, which one might call loop quantization. While originally developed in order to quantize vacuum
general relativity in 3+1 dimensions, the formalism can accommodate arbitrary spacetime dimensionalities,
fermions [5], an arbitrary gauge group (or even quantum group), and supersymmetry [6], and results in a quantization
of the kinematics of the corresponding diffeomorphism-invariant gauge theory. Much work remains to be done on
the dynamics, the classical limit and the correspondence principle, all of which are necessary in one way or another
to make contact with experiment.
In a nutshell, loop quantization is the result of applying C*-algebraic quantization to a non-canonical algebra of
gauge-invariant classical observables. Non-canonical means that the basic observables quantized are not generalized
coordinates and their conjugate momenta. Instead, the algebra generated by spin network observables (built from
holonomies) and field strength fluxes is used.
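To make the role of holonomies concrete, the following minimal sketch builds the holonomy of a discretized closed loop as an ordered product of group elements and checks that its trace (the Wilson loop) is unchanged by a gauge transformation at the vertices. The choice of SU(2) as gauge group, the lattice-style discretization and all function names are assumptions made for illustration; this is a toy check of gauge invariance, not the C*-algebraic construction itself.

```python
import math
import random


def random_su2(rng):
    """Random SU(2) matrix [[a, b], [-conj(b), conj(a)]] with |a|^2 + |b|^2 = 1."""
    x = [rng.gauss(0.0, 1.0) for _ in range(4)]
    norm = math.sqrt(sum(v * v for v in x))
    a = complex(x[0], x[1]) / norm
    b = complex(x[2], x[3]) / norm
    return [[a, b], [-b.conjugate(), a.conjugate()]]


def matmul(m, n):
    """Product of two 2x2 complex matrices stored as nested lists."""
    return [[sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(m):
    """Conjugate transpose, which is the inverse for a unitary matrix."""
    return [[m[j][i].conjugate() for j in range(2)] for i in range(2)]


def holonomy(links):
    """Ordered product U_0 U_1 ... U_{n-1} of the link matrices around the loop."""
    h = [[1 + 0j, 0j], [0j, 1 + 0j]]
    for u in links:
        h = matmul(h, u)
    return h


def wilson_loop(links):
    """Trace of the holonomy around the closed loop."""
    h = holonomy(links)
    return h[0][0] + h[1][1]


def gauge_transform(links, gauges):
    """U_i -> g_i U_i g_{i+1}^dagger, where link i runs from vertex i to vertex i+1 (cyclic)."""
    n = len(links)
    return [matmul(matmul(gauges[i], links[i]), dagger(gauges[(i + 1) % n]))
            for i in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    links = [random_su2(rng) for _ in range(6)]
    gauges = [random_su2(rng) for _ in range(6)]
    print(wilson_loop(links), wilson_loop(gauge_transform(links, gauges)))
```

Under the transformation the holonomy becomes g_0 H g_0^dagger, so its trace is unchanged; this is the sense in which loop observables are gauge invariant.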
Loop quantization techniques are particularly successful in dealing with topological quantum field theories, where
they give rise to state-sum/spin-foam models such as the Turaev-Viro model of 2+1 dimensional general relativity. A
much studied topological quantum field theory is the so-called BF theory in 3+1 dimensions. Since classical general
relativity can be formulated as a BF theory with constraints, scientists hope that a consistent quantization of gravity
may arise from the perturbation theory of BF spin-foam models.
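For orientation, the BF action and its constrained form can be written schematically as follows. The symbols (A, B, F, e, the Lagrange multiplier phi) and the normalisation are notational assumptions introduced here, and the constrained action is shown only in outline.

```latex
% BF theory on a 4-manifold M with gauge group G:
%   A is a connection, F[A] = dA + A \wedge A its curvature,
%   B is a Lie-algebra-valued 2-form.
S_{BF}[A,B] = \int_M \operatorname{Tr}\bigl( B \wedge F[A] \bigr)

% Equations of motion: flat connection and covariantly constant B.
F[A] = 0, \qquad d_A B = 0

% General relativity as a constrained BF theory (schematic):
% the multiplier \phi imposes a constraint C(B) = 0, one sector of whose
% solutions expresses B through a tetrad e.
S[A,B,\phi] = \int_M \operatorname{Tr}\bigl( B \wedge F[A] \bigr) + \int_M \phi \, C(B),
\qquad C(B) = 0 \;\Longrightarrow\; B = \star\,(e \wedge e)
```

Without the constraint the theory has no local degrees of freedom, which is why it is topological; the constraint is what reintroduces the local dynamics of gravity.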
Lorentz invariance
LQG is a quantization of a classical Lagrangian field theory which is equivalent to the usual Einstein-Cartan theory
in that it leads to the same equations of motion describing general relativity with torsion. As such, it can be argued
that LQG respects local Lorentz invariance. Global Lorentz invariance is broken in LQG just as in general relativity.
A positive cosmological constant can be realized in LQG by replacing the Lorentz group with the corresponding
quantum group.
General covariance and background independence
General covariance, also known as "diffeomorphism invariance", is the invariance of physical laws under arbitrary
coordinate transformations. An example is provided by the equations of general relativity, where this symmetry is one
of the defining features of the theory. LQG preserves this symmetry by requiring that the physical states remain
invariant under the generators of diffeomorphisms. The interpretation of this condition is well understood for purely
spatial diffeomorphisms. However, the understanding of diffeomorphisms involving time (the Hamiltonian constraint)
is more subtle because it is related to dynamics and the so-called problem of time in general relativity. [7] A generally
accepted calculational framework to account for this constraint is yet to be found.
Whether or not Lorentz invariance is broken in the low-energy limit of LQG, the theory is formally background
independent. The equations of LQG are not embedded in, and do not presuppose, space and time, except for its invariant
topology. Instead, they are expected to give rise to space and time at distances which are large compared to the
Planck length.
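The invariance conditions described above are usually written as constraint equations on the physical states. The following is a schematic summary in assumed notation (the operator symbols and index conventions are not taken from the text):

```latex
\begin{aligned}
\hat{G}_i \,\Psi &= 0 && \text{(Gauss constraint: internal gauge invariance)} \\
\hat{D}_a \,\Psi &= 0 && \text{(spatial diffeomorphism constraint)} \\
\hat{H} \,\Psi   &= 0 && \text{(Hamiltonian constraint, a Wheeler-DeWitt type equation)}
\end{aligned}
```

The first two conditions are the ones whose interpretation is well understood; the third encodes the dynamics and is where the problem of time enters.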
Problems
While there has been a recent proposal relating to observation of naked singularities, [10] and doubly special
relativity, as a part of a program called loop quantum cosmology, as of now there is no experimental observation for
which loop quantum gravity makes a prediction not made by the Standard Model or general relativity (a problem that
plagues all current theories of quantum gravity).
Making predictions from LQG has been extremely difficult computationally, a recurring problem with modern theories
in physics.
Another problem is that a crucial free parameter in the theory known as the Immirzi parameter can only be computed
by demanding agreement with Bekenstein and Hawking's calculation of the black hole entropy. Loop quantum
gravity predicts that the entropy of a black hole is proportional to the area of the event horizon, but does not obtain
the Bekenstein-Hawking formula S = A/4 unless the Immirzi parameter is chosen to give this value. A prediction
directly from theory would be preferable.
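A simplified counting argument shows how the Immirzi parameter enters the entropy. The sketch below assumes that the horizon is punctured by N spin-network edges all carrying spin j = 1/2, each contributing two microstates, and uses the commonly quoted area spectrum; these assumptions, the symbols gamma and l_P, and the resulting number are illustrative, and more careful state counting in the literature yields a different numerical value.

```latex
% Area of a horizon with N punctures, all with j = 1/2:
A = 8\pi\gamma\,\ell_P^{2}\, N \sqrt{\tfrac12\left(\tfrac12 + 1\right)}
  = 4\sqrt{3}\,\pi\gamma\,\ell_P^{2}\, N

% Two states per puncture give 2^N microstates (entropy in units of k_B):
S = \ln 2^{N} = N \ln 2
  = \frac{\ln 2}{\pi\sqrt{3}\,\gamma}\,\frac{A}{4\,\ell_P^{2}}

% Agreement with the Bekenstein-Hawking formula S = A / (4 \ell_P^2) requires
\gamma = \frac{\ln 2}{\pi\sqrt{3}} \approx 0.127
```

The proportionality of S to A follows from the counting itself; only the coefficient 1/4 depends on fixing gamma, which is the objection raised above.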
Presently, no semiclassical limit recovering general relativity has been shown to exist. This means it remains
unproven that LQG's description of spacetime at the Planck scale has the right continuum limit, described by general
relativity with possible quantum corrections. Specifically, the dynamics of the theory is encoded in the Hamiltonian
constraint, but there is no candidate Hamiltonian. Other technical problems include finding off-shell closure of the
constraint algebra and the physical inner product vector space, coupling to the matter fields of quantum field theory,
and the fate of the renormalization of the graviton in perturbation theory, which leads to ultraviolet divergences
beyond two loops. The fate of Lorentz invariance in loop quantum gravity remains an open problem. [13]
Current LQG research directions attempt to address these known problems, and include spinfoam models and
entropic gravity. [15]
See also
• Loop quantum cosmology
• Heyting algebra
• Category theory
• Noncommutative geometry
• Topos theory
• C*-algebra
• Regge calculus
• Double special relativity
• Lorentz invariance in loop quantum gravity
• Immirzi parameter
• Invariance mechanics
• spin foam
• group field theory
• string-net
• Kodama state
• supersymmetry
Notes
[1] Cf. section 3 in Kuchař 1973.
[2] See Ashtekar 1986, Ashtekar 1987.
[3] For a review, see Thiemann 2006; more extensive accounts can be found in Rovelli 1998, Ashtekar & Lewandowski 2004 as well as in the
lecture notes Thiemann 2003.
[4] See List of loop quantum gravity researchers
[5] Baez, John; Krasnov, Kirill: Quantization of Diffeomorphism-Invariant Theories with Fermions, arXiv:hep-th/9703112
[6] Ling, Yi; Smolin, Lee: Supersymmetric Spin Networks and Quantum Supergravity (http://www.arxiv.org/abs/hep-th/9904016)
[7] See, e.g., Stuart Kauffman and Lee Smolin, "A Possible Solution For The Problem Of Time In Quantum Cosmology" (1997)
(http://www.edge.org/3rd_culture/smolin/smolin_p1.html)
[8] See, e.g., Lee Smolin, "The Case for Background Independence", in Dean Rickles, et al. (eds.) The Structural Foundations of Quantum
Gravity (2006), p 196 ff.
[9] For a highly technical explanation, see, e.g., Carlo Rovelli (2004). Quantum Gravity, p 13 ff.
[10] Goswami et al., "Quantum evaporation of a naked singularity" (http://arxiv.org/abs/gr-qc/0506129). Physical Review Letters. Retrieved
2010-01-02.
[11] http://arxiv.org/abs/hep-th/0501114
[12] http://arxiv.org/abs/hep-th/0501114
[13] "Testing Einstein's special relativity with Fermi's short hard gamma-ray burst GRB090510" (http://arxiv.org/abs/0908.1832). Fermi
GBM/LAT Collaborations. Retrieved August 13, 2009.
[14] http://math.ucr.edu/home/baez/week280.html
[15] Newtonian gravity in loop quantum gravity (http://arxiv.org/abs/1001.3668v1), Lee Smolin, 2010
References
• Topical Reviews
• Rovelli, Carlo (1998), "Loop Quantum Gravity" (http://www.livingreviews.org/lrr-1998-1), Living Rev.
Relativity 1, retrieved 2008-03-13
• Thiemann, Thomas (2003), "Lectures on Loop Quantum Gravity" (http://arxiv.org/abs/gr-qc/0210094),
Lect. Notes Phys. 631: 41-135
• Ashtekar, Abhay; Lewandowski, Jerzy (2004), "Background Independent Quantum Gravity: A Status Report"
(http://arxiv.org/abs/gr-qc/0404018), Class. Quant. Grav. 21: R53-R152,
doi:10.1088/0264-9381/21/15/R01,arXiv:gr-qc/0404018
• Carlo Rovelli and Marcus Gaul, Loop Quantum Gravity and the Meaning of Diffeomorphism Invariance,
e-print available as gr-qc/9910079 (http://arxiv.org/abs/gr-qc/9910079).
• Lee Smolin, The case for background independence, e-print available as hep-th/0507235 (http://arxiv.org/
abs/hep-th/0507235).
• Alejandro Corichi, Loop Quantum Geometry: A primer, e-print available as (http://arxiv.org/abs/gr-qc/
0507038v2).
• Alejandro Perez, Introduction to loop quantum gravity and spin foams, e-print available as (http://arxiv.org/
abs/gr-qc/0409061v3).
• Hermann Nicolai and Kasper Peeters Loop and spin foam quantum gravity: A Brief guide for beginners.,
e-print available as (http://arxiv.org/abs/hep-th/0601129v2).
• Popular books:
• Lee Smolin, Three Roads to Quantum Gravity
• Carlo Rovelli, Che cos'è il tempo? Che cos'è lo spazio?, Di Renzo Editore, Roma, 2004. French translation:
Qu'est-ce que le temps? Qu'est-ce que l'espace?, Bernard Gilson ed, Brussel, 2006. English translation: What is
Time? What is space?, Di Renzo Editore, Roma, 2006.
• Julian Barbour, The End of Time: The Next Revolution in Our Understanding of the Universe
• Musser, George (2008), The Complete Idiot's Guide to String Theory, Indianapolis: Alpha, pp. 368,
ISBN 978-1-59-257702-6 - Focuses on string theory but has an extended discussion of loop gravity as well.
• Magazine articles:
• Lee Smolin, "Atoms of Space and Time," Scientific American, January 2004
• Martin Bojowald, "Following the Bouncing Universe," Scientific American, October 2008
• Easier introductory, expository or critical works:
• Abhay Ashtekar, Gravity and the quantum, e-print available as gr-qc/0410054 (http://arxiv.org/abs/gr-qc/
0410054) (2004)
• John C. Baez and Javier Perez de Muniain, Gauge Fields, Knots and Quantum Gravity, World Scientific
(1994)
• Carlo Rovelli, A Dialog on Quantum Gravity, e-print available as hep-th/0310077 (http://arxiv.org/abs/
hep-th/0310077) (2003)
• More advanced introductory/expository works:
• Carlo Rovelli, Quantum Gravity, Cambridge University Press (2004); draft available online (http://www.cpt.
univ-mrs.fr/~rovelli/book.pdf)
• Thomas Thiemann, Introduction to modern canonical quantum general relativity, e-print available as
gr-qc/0110034 (http://arxiv.org/abs/gr-qc/0110034)
• Thomas Thiemann, Introduction to Modern Canonical Quantum General Relativity, Cambridge University
Press (2007)
• Abhay Ashtekar, New Perspectives in Canonical Gravity, Bibliopolis (1988).
• Abhay Ashtekar, Lectures on Non-Perturbative Canonical Gravity, World Scientific (1991)
• Rodolfo Gambini and Jorge Pullin, Loops, Knots, Gauge Theories and Quantum Gravity, Cambridge
University Press (1996)
• Hermann Nicolai, Kasper Peeters, Marija Zamaklar, Loop quantum gravity: an outside view, e-print available
as hep-th/0501114 (http://arxiv.org/abs/hep-th/0501114)
• H. Nicolai and K. Peeters, Loop and Spin Foam Quantum Gravity: A Brief Guide for Beginners, e-print
available as hep-th/0601129 (http://lanl.arxiv.org/abs/hep-th/0601129)
• T. Thiemann, The LQG - String: Loop Quantum Gravity Quantization of String Theory (http://arxiv.org/abs/
hep-th/0401172v1) (2004)
• Conference proceedings:
• John C. Baez (ed.), Knots and Quantum Gravity
• Fundamental research papers:
• Ashtekar, Abhay (1986), "New variables for classical and quantum gravity", Phys. Rev. Lett. 57 (18):
2244-2247, doi:10.1103/PhysRevLett.57.2244, PMID 10033673
• Ashtekar, Abhay (1987), "New Hamiltonian formulation of general relativity", Phys. Rev. D36: 1587-1602,
doi:10.1103/PhysRevD.36.1587
• Roger Penrose, Angular momentum: an approach to combinatorial space-time in Quantum Theory and
Beyond, ed. Ted Bastin, Cambridge University Press, 1971
• Carlo Rovelli and Lee Smolin, Knot theory and quantum gravity, Phys. Rev. Lett., 61 (1988) 1155
• Carlo Rovelli and Lee Smolin, Loop space representation of quantum general relativity, Nuclear Physics B331
(1990) 80-152
• Carlo Rovelli and Lee Smolin, Discreteness of area and volume in quantum gravity, Nucl. Phys., B442 (1995)
593-622, e-print available as gr-qc/9411005 (http://xxx.lanl.gov/abs/gr-qc/9411005)
• Kuchař, Karel (1973), "Canonical Quantization of Gravity", in Israel, Werner, Relativity, Astrophysics and
Cosmology, D. Reidel, pp. 237-288, ISBN 90-277-0369-8
• Thiemann, Thomas (2006), Loop Quantum Gravity: An Inside View, arXiv:hep-th/0608210
External links
• "Loop Quantum Gravity" by Carlo Rovelli (http://cgpg.gravity.psu.edu/people/Ashtekar/articles/rovelli03.
pdf) Physics World, November 2003
• Quantum Foam and Loop Quantum Gravity (http://universe-review.ca/R01-07-quantumfoam.htm)
• Abhay Ashtekar: Semi-Popular Articles . Some excellent popular articles suitable for beginners about space, time,
GR, and LQG. (http://cgpg.gravity.psu.edu/people/Ashtekar/articles.html)
• Loop Quantum Gravity: Lee Smolin. (http://www.edge.org/3rd_culture/smolin03/smolin03_index.html)
• Loop Quantum Gravity on arxiv.org (http://xstructure.inr.ac.ru/x-bin/theme3.py?level=2&index1=205615)
• A list of LQG references catered to fresh graduates (http://sps.nus.edu.sg/~wongjian/lqg.html)
• Loop Quantum Gravity Lectures Online (http://www.perimeterinstitute.ca/Events/
Introduction_to_Quantum_Gravity/Introduction_to_Quantum_Gravity/) by Lee Smolin
• Spin networks, spin foams and loop quantum gravity (http://jdc.math.uwo.ca/spin-foams/)
• Wired magazine, News: Moving Beyond String Theory (http://www.wired.com/news/technology/0,71828-0.
html)
• April 2006 Scientific American Special Issue, A Matter of Time, has Lee Smolin LQG Article Atoms of Space and
Time (http://www.sciam.com/special/toc.cfm?issueid=40&sc=rt_nav_list)
• September 2006, The Economist, article Looping the loop (http://www.economist.com/science/displaystory.
cfm?story_id=7963608)
• Gamma-ray Large Area Space Telescope: http://glast.gsfc.nasa.gov/
• Zeno meets modern science, (http://uk.arxiv.org/abs/physics/0505042) Article from Acta Physica Polonica B
(http://th-www.if.uj.edu.pl/acta/) by Z.K. Silagadze.
• Did pre-big bang universe leave its mark on the sky? (http://space.newscientist.com/article/mg19826514.
300-did-prebig-bang-universe-leave-its-mark-on-the-sky.html?feedId=online-news_rss20) - According to a
model based on "loop quantum gravity" theory, a parent universe that existed before ours may have left an imprint
(New Scientist, 10 April 2008)
Quantum Algebraic Topology
In physics, topological order is a new kind of order (a new kind of organization of particles) in a quantum state
that is beyond the Landau symmetry-breaking description. It cannot be described by local order parameters and long
range correlations. However, topological orders can be described by a new set of quantum numbers, such as ground
state degeneracy, quasiparticle fractional statistics, edge states, topological entropy, etc. Roughly speaking,
topological order is a pattern of long-range quantum entanglement in quantum states. States with different
topological orders can change into each other only through a phase transition.
Background
Although all matter is formed by atoms, matter can have very different properties and appear in very different forms,
such as solid, liquid, superfluid, magnet, etc. According to condensed matter physics and the principle of emergence,
the different properties of materials originate from the different ways in which the atoms are organized in the
materials. Those different organizations of the atoms (or other particles) are formally called the orders in the
materials.
Atoms can organize in many ways which lead to many different orders and many different types of materials.
Landau symmetry-breaking theory provides a general understanding of these different orders. It points out that
different orders really correspond to different symmetries in the organizations of the constituent atoms. As a material
changes from one order to another order (i.e., as the material undergoes a phase transition), what happens is that the
symmetry of the organization of the atoms changes.
For example, atoms have a random distribution in a liquid, so a liquid remains the same as we displace it by an
arbitrary distance. We say that a liquid has a continuous translation symmetry. After a phase transition, a liquid can
turn into a crystal. In a crystal, atoms organize into a regular array (a lattice). A lattice remains unchanged only when
we displace it by a particular distance (an integer multiple of the lattice constant), so a crystal has only discrete translation
symmetry. The phase transition between a liquid and a crystal is a transition that reduces the continuous translation
symmetry of the liquid to the discrete symmetry of the crystal. Such change in symmetry is called symmetry
breaking. The essence of the difference between liquids and crystals is therefore that the organizations of atoms have
different symmetries in the two phases.
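The difference between continuous and discrete translation symmetry can be checked on a toy model. In the sketch below, a uniform density stands in for the liquid and a periodic density with lattice constant a stands in for the crystal; both densities, the one-dimensional setting and the function names are assumptions chosen for illustration only.

```python
import math


def liquid_density(x):
    """Toy uniform density: the same at every point."""
    return 1.0


def crystal_density(x, a=1.0):
    """Toy 1D density with period a, peaked at the lattice sites x = n * a."""
    return 1.0 + math.cos(2.0 * math.pi * x / a)


def is_symmetric_under_shift(density, shift, samples=200, span=10.0, tol=1e-9):
    """Check density(x + shift) == density(x) on a grid of sample points."""
    for k in range(samples):
        x = span * k / samples
        if abs(density(x + shift) - density(x)) > tol:
            return False
    return True


if __name__ == "__main__":
    for shift in (0.37, 1.0, 3.0):
        print(shift,
              is_symmetric_under_shift(liquid_density, shift),
              is_symmetric_under_shift(crystal_density, shift))
```

The liquid passes the check for every shift, while the crystal passes only for integer multiples of the lattice constant, which is the reduction of symmetry that the text calls symmetry breaking.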
Landau symmetry-breaking theory is a very successful theory. For a long time, physicists believed that Landau
symmetry-breaking theory describes all possible orders in materials, and all possible (continuous) phase transitions.
The discovery and characterization of topological order
However, since the late 1980s, it has become gradually apparent that Landau symmetry-breaking theory may not
describe all possible orders. In an attempt to explain high-temperature superconductivity, the chiral spin state
was introduced. At first, physicists still wanted to use Landau symmetry-breaking theory to describe the chiral spin
state. They identified the chiral spin state as a state that breaks the time reversal and parity symmetries, but not the
spin rotation symmetry. This should be the end of story according to Landau's symmetry breaking description of
orders. However, it was quickly realized that there are many different chiral spin states that have exactly the same
symmetry, so symmetry alone was not enough to characterize different chiral spin states. This means that the chiral
spin states contain a new kind of order that is beyond the usual symmetry description. The proposed new kind of
order was named "topological order". (The name "topological order" is motivated by the low energy effective
theory of the chiral spin states, which is a topological quantum field theory (TQFT).) New quantum
numbers, such as ground state degeneracy and the non-Abelian Berry's phase of degenerate ground states,
were introduced to characterize the different topological orders in chiral spin states. Recently, it was shown that
topological orders can also be characterized by topological entropy. [12][13]
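Two of these quantum numbers have compact standard expressions, collected here for reference. The notation (m, g, L, alpha, gamma, the quantum dimensions d_a) is assumed rather than taken from the text, and the Laughlin state is used only as a familiar example.

```latex
% Ground state degeneracy of the Laughlin state at filling \nu = 1/m
% on a closed surface of genus g:
\mathrm{GSD} = m^{g}

% Topological entanglement entropy of a disc-like region with boundary length L:
S(L) = \alpha L - \gamma + \dots, \qquad
\gamma = \ln \mathcal{D}, \qquad
\mathcal{D} = \sqrt{\textstyle\sum_a d_a^{2}}

% For the Laughlin state there are m Abelian quasiparticle types with d_a = 1, so
\mathcal{D} = \sqrt{m}, \qquad \gamma = \tfrac12 \ln m
```

The degeneracy depends on the topology of the surface rather than on any symmetry, and the constant gamma is insensitive to the non-universal coefficient alpha; both properties are what make these quantities usable as labels of topological order.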
But experiments soon indicated that chiral spin states do not describe high-temperature superconductors, and the
theory of topological order became a theory with no experimental realization. However, the similarity between chiral
spin states and quantum Hall states allows one to use the theory of topological order to describe different quantum
Hall states. Just like chiral spin states, different quantum Hall states all have the same symmetry and are beyond
the Landau symmetry-breaking description. One finds that the different orders in different quantum Hall states can
indeed be described by topological orders, so the topological order does have experimental realizations.
Fractional quantum Hall (FQH) states were discovered in 1982, before the introduction of the concept of
topological order. But FQH states are not the first experimentally discovered topologically ordered states.
Superconductors, discovered in 1911, are the first; they have a Z2 topological order (note, however, that
superconductivity does fall within the Ginzburg-Landau theory description of phase transitions; in fact the
prediction of the vortex state in superconductors was one of the main successes of Ginzburg-Landau theory). [17][18]
Mechanism of topological order
A large class of topological orders is realized through a mechanism called string-net condensation. [19] This class of
topological orders can be classified by utilizing tensor category (or monoidal category) theory. One finds that
string-net condensation can generate infinitely many different types of topological orders, which may indicate that
there are many different new types of materials remaining to be discovered.
The collective motions of condensed strings give rise to excitations above the string-net condensed states. Those
excitations turn out to be gauge bosons. The ends of strings are defects which correspond to another type of
excitations. Those excitations are the gauge charges and can carry Fermi or fractional statistics. [20]
The condensations of other extended objects such as "membranes", "brane-nets", and fractals also lead to
topologically ordered phases and "quantum glassiness".
Mathematical foundation of topological order
We know that group theory is the mathematical foundation of symmetry breaking orders. What is the mathematical
foundation of topological order? The string-net condensation suggests that tensor category (or monoidal category)
theory may be the mathematical foundation of topological order. Quantum operator algebra is a very important
mathematical tool in studying topological orders. Some also suggest that topological order is mathematically
described by extended quantum symmetry.
Applications
The materials described by Landau symmetry-breaking theory have had a substantial impact on technology. For
example, ferromagnetic materials that break spin rotation symmetry can be used as the media of digital information
storage. A hard drive made of ferromagnetic materials can store gigabytes of information. Liquid crystals that break
the rotational symmetry of molecules find wide application in display technology; nowadays one can hardly find a
household without a liquid crystal display somewhere in it. Crystals that break translation symmetry lead to well
defined electronic bands which in turn allow us to make semiconducting devices such as transistors.
The different types of topological order are even richer than the different types of symmetry-breaking order. This
suggests their potential for exciting, novel applications.
One theorized application would be to use topologically ordered states as media for quantum computing in a
technique known as topological quantum computing. A topologically ordered state is a state with complicated
non-local quantum entanglement. The non-locality means that the quantum entanglement in a topologically ordered
state is distributed among many different particles. As a result, the pattern of quantum entanglements cannot be
destroyed by local perturbations. This significantly reduces the effect of decoherence. This suggests that if we use
different quantum entanglements in a topologically ordered state to encode quantum information, the information
may last much longer. The quantum information encoded by the topological quantum entanglements can also be
manipulated by dragging the topological defects around each other. This process may provide a physical apparatus
for performing quantum computations. Therefore, topologically ordered states may provide natural media for
both quantum memory and quantum computation. Such realizations of quantum memory and quantum computation
may potentially be made fault tolerant. [28]
Topologically ordered states in general have the special property that they contain non-trivial boundary states. In many
cases, those boundary states become perfect conducting channels that can conduct electricity without generating
heat. [29] This can be another potential application of topological order in electronic devices. The topological
insulator, for instance, is a simple example of topological order. The boundary states of a topological insulator
play a key role in the detection and the application of topological insulators.
Potential impact
Why is topological order important? Landau symmetry-breaking theory is a cornerstone of condensed matter
physics. It is used to define the territory of condensed matter research. The existence of topological order appears to
indicate that nature is much richer than Landau symmetry-breaking theory has so far indicated. The exciting time of
condensed matter physics is still ahead of us. Some suggest that topological order (or more precisely, string-net
condensation [32]) and the local bosonic (spin) models have the potential to provide a unified origin for photons,
electrons and other elementary particles in our universe. [33]
References
[1] Xiao-Gang Wen, Topological Orders in Rigid States, (http://dao.mit.edu/~wen/pub/topo.pdf) Int. J. Mod. Phys. B4, 239 (1990)
[2] J.G. Bednorz and K.A. Mueller (1986). "Possible high TC superconductivity in the Ba-La-Cu-O system". Z. Phys. B64 (2): 189-193.
doi:10.1007/BF01303701.
[3] V. Kalmeyer and R. B. Laughlin, Phys. Rev. Lett., 59, 2095 (1987), "Equivalence of the resonating-valence-bond and fractional quantum Hall
states"
[4] Xiao-Gang Wen, F. Wilczek and A. Zee, Phys. Rev., B39, 11413 (1989), "Chiral Spin States and Superconductivity"
[5] Xiao-Gang Wen, Phys. Rev. B, 40, 7387 (1989), "Vacuum Degeneracy of Chiral Spin State in Compactified Spaces"
[6] Xiao-Gang Wen, Intl. J. Mod. Phys., B4, 239 (1990), "Topological Orders in Rigid States" (http://dao.mit.edu/~wen/pub/topo.pdf)
[7] Atiyah, Michael (1988), "Topological quantum field theories", Publications Mathématiques de l'IHÉS (68): 175-186, MR1001453, ISSN
1618-1913, http://www.numdam.org/item?id=PMIHES_1988_68__175_0
[8] Witten, Edward (1988), "Topological quantum field theory", Communications in Mathematical Physics 117 (3): 353-386, MR953828, ISSN
0010-3616, http://projecteuclid.org/euclid.cmp/1104161738
[9] Yetter D.N., TQFTs from homotopy 2-types, J. Knot Theory 2 (1993),113-123.
[10] Xiao-Gang Wen, Phys. Rev. B, 40, 7387 (1989), "Vacuum Degeneracy of Chiral Spin State in Compactified Spaces"
[11] Xiao-Gang Wen, Intl. J. Mod. Phys., B4, 239 (1990), "Topological Orders in Rigid States" (http://dao.mit.edu/~wen/pub/topo.pdf)
[12] Alexei Kitaev and John Preskill, Phys. Rev. Lett. 96, 110404 (2006), "Topological Entanglement Entropy"
[13] Levin M. and Wen X-G., Detecting topological order in a ground state wave function, Phys. Rev. Lett. 96(11), 110405 (2006)
(http://link.aps.org/doi/10.1103/PhysRevLett.96.110405)
[14] Xiao-Gang Wen and Qian Niu, Phys. Rev. B41, 9377 (1990), "Ground state degeneracy of the FQH states in presence of random potential
and on high genus Riemann surfaces" (http://dao.mit.edu/~wen/pub/topWN.pdf)
[15] D. C. Tsui and H. L. Stormer and A. C. Gossard, Phys. Rev. Lett., 48, 1559 (1982), "Two-Dimensional Magnetotransport in the Extreme
Quantum Limit"
[16] R. B. Laughlin, Phys. Rev. Lett., 50, 1395 (1983), "Anomalous Quantum Hall Effect: An Incompressible Quantum Fluid with Fractionally
Charged Excitations"
[17] Xiao-Gang Wen, Mean Field Theory of Spin Liquid States with Finite Energy Gaps and Topological Orders, Phys. Rev. B44, 2664 (1991)
(http://link.aps.org/doi/10.1103/PhysRevB.44.2664).
[18] T. H. Hansson, Vadim Oganesyan, S. L. Sondhi, Superconductors are topologically ordered (http://arxiv.org/abs/cond-mat/0404327),
Annals Of Physics vol. 313, 497 (2004)
[19] Michael Levin, Xiao-Gang Wen, Phys. Rev. B, 71, 045110 (2005), "String-net condensation: A physical mechanism for topological phases"
[20] Levin M. and Wen X-G., Fermions, strings, and gauge fields in lattice spin models., Phys. Rev. B 67, 245316, (2003).
[21] Hamma et al., 2005
[22] Bombin, M.A. Martin-Delgado, 2006
[23] Xiao-Gang Wen, Int. J. Mod. Phys. B5, 1641 (1991); Topological Orders and Chern-Simons Theory in strongly correlated quantum liquid, a
review containing comments on topological orders in higher dimensions and/or in Higgs phases; also introduced a dimension index (DI) to
characterize the robustness of the ground state degeneracy of a topologically ordered state. If DI is less or equal to 1 , then topological orders
cannot exist at finite temperature.
[24] Quantum Glassiness, Chamon C., Phys. Rev. Lett., 94, 040402, (2005). (http://link.aps.org/doi/10.1103/PhysRevLett.94.040402)
[25] Algebraic Topology Foundations of Supersymmetry and Symmetry Breaking in Quantum Field Theory and Quantum Gravity: A Review.,
Baianu, I.C., J.F. Glazebrook and R. Brown., SIGMA-08 1030,(2009), 78 pages.
[26] Eric Dennis, Alexei Kitaev, Andrew Landahl, and John Preskill, J. Math. Phys., 43, 4452 (2002), Topological quantum memory
[27] Michael H. Freedman, Alexei Kitaev, Michael J. Larsen, and Zhenghan Wang, Bull. Amer. Math. Soc, 40, 31 (2003), "Topological
quantum computation"
[28] A. Yu. Kitaev Ann. Phys. (N.Y.), 303, 1 (2003), Fault-tolerant quantum computation by anyons
[29] Xiao-Gang Wen, Phys. Rev. B, 43, 11025 (1991), "Gapless Boundary Excitations in the FQH States and in the Chiral Spin States" (http://
dao.mit.edu/~wen/pub/bdry.pdf)
[30] S. Murakami, N. Nagaosa, and S.-C. Zhang, Phys. Rev. Lett. 93, 156804 (2004).
[31] C. Kane and E. Mele, Phys. Rev. Lett. 95, 226801 (2005).
[32] http://arxiv.org/abs/hep-th/0507118v2
[33] Levin M. and Wen X-G., Colloquium: Photons and electrons as emergent phenomena, Rev. Mod. Phys. 77, 871 (2005)
(http://arxiv.org/abs/hep-th/0507118v2), 4 pages; also, Quantum ether: Photons and electrons from a rotor model, arXiv:hep-th/0507118 (2007).
References by categories
Fractional quantum Hall states
• D. C. Tsui and H. L. Stormer and A. C. Gossard, Phys. Rev. Lett., 48, 1559 (1982), "Two-Dimensional
Magnetotransport in the Extreme Quantum Limit"
• R. B. Laughlin, Phys. Rev. Lett., 50, 1395 (1983), "Anomalous Quantum Hall Effect: An Incompressible
Quantum Fluid with Fractionally Charged Excitations"
Chiral spin states
• V. Kalmeyer and R. B. Laughlin, Phys. Rev. Lett., 59, 2095 (1987), "Equivalence of the resonating-valence-bond
and fractional quantum Hall states"
• Xiao-Gang Wen, F. Wilczek and A. Zee, Phys. Rev., B39, 11413 (1989), "Chiral Spin States and
Superconductivity"
Early characterization of FQH states
• Off-diagonal long-range order, oblique confinement, and the fractional quantum Hall effect, S. M. Girvin and A.
H. MacDonald, Phys. Rev. Lett., 58, 1252 (1987)
• Effective-Field-Theory Model for the Fractional Quantum Hall Effect, S. C. Zhang and T. H. Hansson and S.
Kivelson, Phys. Rev. Lett., 62, 82 (1989)
Topological order
• Xiao-Gang Wen, Phys. Rev. B, 40, 7387 (1989), "Vacuum Degeneracy of Chiral Spin State in Compactified
Spaces"
• Xiao-Gang Wen, Int. J. Mod. Phys., B4, 239 (1990), "Topological Orders in Rigid States"
(http://dao.mit.edu/~wen/pub/topo.pdf)
• Xiao-Gang Wen, Quantum Field Theory of Many Body Systems — From the Origin of Sound to an Origin of Light
and Electrons, Oxford Univ. Press, Oxford, 2004.
Characterization of topological order
• D. Arovas and J. R. Schrieffer and F. Wilczek, Phys. Rev. Lett., 53, 722 (1984), "Fractional Statistics and the
Quantum Hall Effect"
• Xiao-Gang Wen and Qian Niu, Phys. Rev. B41, 9377 (1990), "Ground state degeneracy of the FQH states in
presence of random potential and on high genus Riemann surfaces" (http://dao.mit.edu/~wen/pub/topWN.pdf)
• Xiao-Gang Wen, Phys. Rev. B, 43, 11025 (1991), "Gapless Boundary Excitations in the FQH States and in the
Chiral Spin States" (http://dao.mit.edu/~wen/pub/bdry.pdf)
• Alexei Kitaev and John Preskill, Phys. Rev. Lett. 96, 110404 (2006), "Topological Entanglement Entropy"
• Michael Levin and Xiao-Gang Wen, Phys. Rev. Lett. 96, 110405 (2006), "Detecting Topological Order in a
Ground State Wave Function" (http://link.aps.org/doi/10.1103/PhysRevLett.96.110405)
Effective theory of topological order
• Quantum field theory and the Jones polynomial, E. Witten, Comm. Math. Phys., 121, 351 (1989)
Mechanism of topological order
• Michael Levin, Xiao-Gang Wen, Phys. Rev. B, 71, 045110 (2005), String-net condensation: A physical
mechanism for topological phases
• Chamon, C., Phys. Rev. Lett. 94, 040402 (2005), Quantum Glassiness in Strongly Correlated Clean Systems: An
Example of Topological Overprotection (http://link.aps.org/doi/10.1103/PhysRevLett.94.040402)
• Alioscia Hamma, Paolo Zanardi, Xiao-Gang Wen, Phys.Rev. B72 035307 (2005), String and Membrane
condensation on 3D lattices
• H. Bombin, M.A. Martin-Delgado, cond-mat/0607736, Exact Topological Quantum Order in D=3 and Beyond:
Branyons and Brane-Net Condensates
Quantum computing
• Chetan Nayak, Steven H. Simon (http://www-thphys.physics.ox.ac.uk/people/SteveSimon/), Ady Stern,
Michael Freedman, Sankar Das Sarma, http://www.arxiv.org/abs/0707.1889, 2007, "Non-Abelian Anyons
and Topological Quantum Computation"
• A. Yu. Kitaev, Ann. Phys. (N.Y.), 303, 1 (2003), Fault-tolerant quantum computation by anyons
• Michael H. Freedman, Alexei Kitaev, Michael J. Larsen, and Zhenghan Wang, Bull. Amer. Math. Soc, 40, 31
(2003), "Topological quantum computation"
• Eric Dennis, Alexei Kitaev, Andrew Landahl, and John Preskill, J. Math. Phys., 43, 4452 (2002), Topological
quantum memory
• Ady Stern and Bertrand I. Halperin, Phys. Rev. Lett., 96, 016802 (2006), Proposed Experiments to probe the
Non-Abelian nu=5/2 Quantum Hall State
Emergence of elementary particles
• Xiao-Gang Wen, Phys. Rev. D68, 024501 (2003), Quantum order from string-net condensations and origin of
light and massless fermions
• M. Levin and Xiao-Gang Wen, Fermions, strings, and gauge fields in lattice spin models., Phys. Rev. B 67,
245316, (2003).
• M. Levin and Xiao-Gang Wen, Colloquium: Photons and electrons as emergent phenomena, Rev. Mod. Phys. 77,
871 (2005), 4 pages; also, Quantum ether: Photons and electrons from a rotor
model, arXiv:hep-th/0507118, 2007.
• Zheng-Cheng Gu and Xiao-Gang Wen, gr-qc/0606100, A lattice bosonic model as a quantum theory of gravity.
Quantum operator algebra
• Yetter D.N., TQFTs from homotopy 2-types, J. Knot Theory 2 (1993), 113-123.
• Landsman N. P. and Ramazan B., Quantization of Poisson algebras associated to Lie algebroids, in Proc. Conf. on
Groupoids in Physics, Analysis and Geometry (Boulder CO, 1999), Editors J. Kaminker et al., 159-192, Contemp.
Math. 282, Amer. Math. Soc., Providence RI, 2001 (also math-ph/001005).
• Baianu, I.C., Non-Abelian Quantum Algebraic Topology (NAQAT), 20 Nov. (2008), 87 pages
(http://planetphysics.org/?op=getobj&from=lec&id=61)
• Levin A. and Olshanetsky M., Hamiltonian Algebroids and deformations of complex structures on Riemann
curves, hep-th/0301078v1.
• Xiao-Gang Wen, Yong-Shi Wu and Y. Hatsugai., Chiral operator product algebra and edge excitations of a FQH
droplet (pdf), Nucl. Phys. B422, 476 (1994): Used chiral operator product algebra to construct the bulk wave
function, characterize the topological orders and calculate the edge states for some non-Abelian FQH states.
• Xiao-Gang Wen and Yong-Shi Wu., Chiral operator product algebra hidden in certain FQH states (pdf), Nucl.
Phys. B419, 455 (1994): Demonstrated that non-Abelian topological orders are closely related to chiral operator
product algebra (instead of conformal field theory).
• Non-Abelian theory. (http://planetphysics.org/encyclopedia/NonAbelianTheory.html)
• R. Brown et al., A Non-Abelian, Categorical Ontology of Spacetimes and Quantum Gravity, Axiomathes, Volume
17, Numbers 3-4 / December (2007), pages 353-408, Springer, Netherlands, ISSN 1122-1151 (Print) 1572-8390
(Online), doi:10.1007/s10516-007-9012-1.
• Ronald Brown, P. J. Higgins and R. Sivera (2009), Nonabelian Algebraic Topology, vols. 1 and 2, Ch. U. Press,
in press (http://www.bangor.ac.uk/~mas010/nonab-t/partI010604.pdf)
• A Bibliography for Categories and Algebraic Topology Applications in Theoretical Physics
(http://planetphysics.org/encyclopedia/BibliographyForCategoryTheoryAndAlgebraicTopologyApplicationsInTheoreticalPhysics.html)
• Quantum Algebraic Topology (QAT)
(http://planetphysics.org/encyclopedia/QuantumAlgebraicTopologyTopics.html)
Commutativity
In mathematics an operation is commutative if changing the order of the operands does not change the end result.
It is a fundamental property of many binary operations, and many mathematical proofs depend on it. The
commutativity of simple operations, such as multiplication and addition of numbers, was for many years implicitly
assumed, and the property was not named until the 19th century when mathematics started to become formalized. By
contrast, division and subtraction are not commutative.
Common uses
The commutative property (or commutative law) is a property associated with binary operations and functions.
Similarly, if the commutative property holds for a pair of elements under a certain binary operation then it is said that
the two elements commute under that operation.
In group and set theory, many algebraic structures are called commutative when certain operands satisfy the
commutative property. In higher branches of mathematics, such as analysis and linear algebra, the commutativity of
well known operations (such as addition and multiplication on real and complex numbers) is often used (or implicitly
assumed) in proofs.[1][2][3]
Mathematical definitions
The term "commutative" is used in several related senses.[4][5]
1. A binary operation * on a set S is said to be commutative if:
$\forall x, y \in S : x * y = y * x$
- An operation that does not satisfy the above property is called noncommutative.
2. One says that x commutes with y under * if:
$x * y = y * x$
3. A binary function $f : A \times A \to B$ is said to be commutative if:
$\forall x, y \in A : f(x, y) = f(y, x)$
Example showing the commutativity of addition (3 + 2 = 2 + 3)
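The definitions above can be checked mechanically on a finite sample of elements. The following is a minimal sketch, not a proof: the helper names `is_commutative` and `commutes` are illustrative choices, and passing the check on a sample does not establish commutativity on an infinite set.

```python
from itertools import product
import operator


def is_commutative(op, elements):
    """Sense 1: op(x, y) == op(y, x) for every pair drawn from `elements`."""
    return all(op(x, y) == op(y, x) for x, y in product(elements, repeat=2))


def commutes(op, x, y):
    """Sense 2: the single pair (x, y) commutes under op."""
    return op(x, y) == op(y, x)


sample = range(-3, 4)  # a small finite sample of integers

print(is_commutative(operator.add, sample))  # addition passes on the sample
print(is_commutative(operator.sub, sample))  # subtraction fails on the sample
print(commutes(operator.sub, 2, 2))          # yet an individual pair may still commute
```

The last line illustrates why sense 2 is stated separately: a noncommutative operation can still have particular pairs of elements that commute.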
History and etymology
The first known use of the term was in a French journal published in 1814.
Records of the implicit use of the commutative property go back to ancient times. The Egyptians used the
commutative property of multiplication to simplify computing products.[6][7] Euclid is known to have assumed the
commutative property of multiplication in his book Elements.[8] Formal uses of the commutative property arose in
the late 18th and early 19th centuries, when mathematicians began to work on a theory of functions. Today the
commutative property is a well known and basic property used in most branches of mathematics.
The first recorded use of the term commutative was in a memoir by François Servois in 1814,[9][10] which used the
word commutatives when describing functions that have what is now called the commutative property. The word is a
combination of the French word commuter meaning "to substitute or switch" and the suffix -ative meaning "tending
to" so the word literally means "tending to substitute or switch." The term then appeared in English in Philosophical
Transactions of the Royal Society in 1844.[9]
Related properties
Associativity
The associative property is closely related to the commutative
property. The associative property of an expression containing two
or more occurrences of the same operator states that the order in
which operations are performed does not affect the final result, as
long as the order of terms is not changed. In contrast, the
commutative property states that the order of the terms does not
affect the final result.
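The two properties are independent of one another, which a small finite check can illustrate. In this sketch the operations `concat` and `midpoint` and the helper names are illustrative choices: string concatenation is associative but not commutative, while taking the midpoint of two numbers is commutative but not associative.

```python
from itertools import product


def is_commutative(op, elements):
    """Order of the operands does not matter."""
    return all(op(x, y) == op(y, x) for x, y in product(elements, repeat=2))


def is_associative(op, elements):
    """Grouping of the operations does not matter (operand order unchanged)."""
    return all(
        op(op(x, y), z) == op(x, op(y, z))
        for x, y, z in product(elements, repeat=3)
    )


def concat(a, b):
    """String concatenation."""
    return a + b


def midpoint(a, b):
    """Arithmetic mean of two numbers."""
    return (a + b) / 2


strings = ["a", "b", "ab"]
numbers = [0, 4, 8]

print(is_associative(concat, strings), is_commutative(concat, strings))
print(is_associative(midpoint, numbers), is_commutative(midpoint, numbers))
```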
Graph showing the symmetry of the addition function
Symmetry can be directly linked to commutativity. When a
commutative operator is written as a binary function then the resulting function is symmetric across the line y = x.
As an example, if we let a function f represent addition (a commutative operation) so that f(x, y) = x + y, then f is a
symmetric function, which can be seen in the graph of the addition function.
For binary relations, a symmetric relation is analogous to a commutative operation, in that if a relation R is
symmetric, then $aRb \Leftrightarrow bRa$.
Examples
Commutative operations in everyday life
• Putting on socks resembles a commutative operation, since which sock is put on first is unimportant. Either way,
the end result (having both socks on), is the same.
• The commutativity of addition is observed when paying for an item with cash. Regardless of the order in which
the bills are handed over, they always give the same total.
Commutative operations in mathematics
Two well-known examples of commutative binary operations are:
• The addition of real numbers, which is commutative since
$\forall y, z \in \mathbb{R} : y + z = z + y$
For example 4 + 5 = 5 + 4, since both expressions equal 9.
• The multiplication of real numbers, which is commutative since
$\forall y, z \in \mathbb{R} : yz = zy$
For example, 3 × 5 = 5 × 3, since both expressions equal 15.
• Further examples of commutative binary operations include addition and multiplication of complex numbers,
addition and scalar multiplication of vectors, and intersection and union of sets.
Noncommutative operations in everyday life
• Concatenation, the act of joining character strings together, is a noncommutative operation. For example
$\text{EA} + \text{T} = \text{EAT} \neq \text{TEA} = \text{T} + \text{EA}$
• Washing and drying clothes resembles a noncommutative operation; washing and then drying produces a
markedly different result to drying and then washing.
• The twists of the Rubik's Cube are noncommutative. This is studied in group theory.
Noncommutative operations in mathematics
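Some noncommutative binary operations are:[11]
• Subtraction is noncommutative since $0 - 1 \neq 1 - 0$
• Division is noncommutative since $1/2 \neq 2/1$
• Infinite addition is not (necessarily) commutative: the partial sums of
$1 - 1 + 1 - 1 + 1 - 1 + 1 - 1 + \cdots$
never exceed 1, whereas the partial sums of the rearrangement
$1 + 1 - 1 + 1 + 1 - 1 + 1 + 1 - 1 + \cdots$
grow without bound.
• Matrix multiplication is noncommutative since
$\begin{bmatrix} 0 & 2 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \cdot \begin{bmatrix} 0 & 1 \\ 0 & 1 \end{bmatrix} \neq \begin{bmatrix} 0 & 1 \\ 0 & 1 \end{bmatrix} \cdot \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 0 & 1 \end{bmatrix}$
• The vector product (or cross product) of two vectors in three dimensions is anti-commutative, i.e.,
$b \times a = -(a \times b)$.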
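Two of the examples in this list can be verified numerically. The sketch below multiplies the 2×2 matrices with rows (1, 1), (0, 1) and (0, 1), (0, 1) in both orders and checks the sign flip of a cross product; the helper names `matmul` and `cross` and the particular test vectors are illustrative choices.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def cross(u, v):
    """Cross product of two 3-component vectors."""
    return [
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    ]


A = [[1, 1], [0, 1]]
B = [[0, 1], [0, 1]]
print(matmul(A, B))  # [[0, 2], [0, 1]]
print(matmul(B, A))  # [[0, 1], [0, 1]]

a = [1, 0, 0]
b = [0, 1, 0]
print(cross(a, b))   # [0, 0, 1]
print(cross(b, a))   # [0, 0, -1], the negative of the line above
```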
Mathematical structures and commutativity
• A commutative semigroup is a set endowed with a total, associative and commutative operation.
• If the operation additionally has an identity element, we have a commutative monoid.
• An abelian group, or commutative group, is a group whose group operation is commutative.[2]
• A commutative ring is a ring whose multiplication is commutative. (Addition in a ring is always
commutative.)[12]
• In a field both addition and multiplication are commutative.[13]
Non-commuting operators in quantum mechanics
In quantum mechanics as formulated by Schrödinger, physical variables are represented by linear operators such as x
(meaning multiply by x), and d/dx. These two operators do not commute, as may be seen by considering the effect of
their products x (d/dx) and (d/dx) x on a one-dimensional wave function ψ(x):
$x \frac{d}{dx}\psi = x\psi' \neq \frac{d}{dx}(x\psi) = \psi + x\psi'$
According to the uncertainty principle of Heisenberg, if the two operators representing a pair of variables do not
commute, then that pair of variables are mutually complementary, which means that they cannot be simultaneously
measured or known precisely. For example, the position and the linear momentum of a particle are represented
respectively (in the x-direction) by the operators x and (h/2πi) d/dx (where h is Planck's constant). This is the same
example except for the constant (h/2πi), so again the operators do not commute and the physical meaning is that the
position and linear momentum in a given direction are complementary.
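The step from the operator x and d/dx to position and momentum can be written out as a short worked derivation. This is a sketch using the operators named in the text; the symbols $\hat{p}$ for the momentum operator and $\hbar = h/2\pi$ are notational assumptions not used in the passage itself.

```latex
\begin{align}
\left[x, \frac{d}{dx}\right]\psi
  &= x\frac{d\psi}{dx} - \frac{d}{dx}(x\psi)
   = x\psi' - (\psi + x\psi') = -\psi, \\
\hat{p} = \frac{h}{2\pi i}\frac{d}{dx}
  \quad\Longrightarrow\quad
[x, \hat{p}]\,\psi
  &= \frac{h}{2\pi i}\,(-\psi)
   = \frac{i h}{2\pi}\,\psi
   = i\hbar\,\psi .
\end{align}
```

Since the commutator is a nonzero constant multiple of ψ for every wave function, no choice of ψ makes the two orderings agree, which is the sense in which position and momentum fail to commute.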
Notes
[1] Axler, p. 2
[2] Gallian, p. 34
[3] p. 26, 87
[4] Krowne, p. 1
[5] Weisstein, Commute, p. 1
[6] Lumpkin, p. 11
[7] Gay and Shute, p.?
[8] O'Connor and Robertson, Real Numbers
[9] Cabillon and Miller, Commutative and Distributive
[10] O'Connor and Robertson, Servois
[11] Yark, p. 1
[12] Gallian, p. 236
[13] Gallian, p. 250
References
Books
• Axler, Sheldon (1997). Linear Algebra Done Right, 2e. Springer. ISBN 0-387-98258-2.
Linear algebra theory. Explains commutativity in chapter 1, uses it throughout.
• Goodman, Frederick (2003). Algebra: Abstract and Concrete, Stressing Symmetry, 2e. Prentice Hall.
ISBN 0-13-067342-0.
Abstract algebra theory. Uses commutativity property throughout book.
• Gallian, Joseph (2006). Contemporary Abstract Algebra, 6e. Boston, Mass.: Houghton Mifflin.
ISBN 0-618-51471-6.
Abstract algebra theory. Covers commutativity in that context. Uses property throughout book.
Articles
• Lumpkin, B. (1997). The Mathematical Legacy Of Ancient Egypt - A Response To Robert Palter
(http://www.ethnomath.org/resources/lumpkin1997.pdf). Unpublished manuscript.
Article describing the mathematical ability of ancient civilizations.
• Robins, R. Gay, and Charles C. D. Shute. 1987. The Rhind Mathematical Papyrus: An Ancient Egyptian Text.
London: British Museum Publications Limited. ISBN 0-7141-0944-4
Translation and interpretation of the Rhind Mathematical Papyrus.
Online resources
• Krowne, Aaron, Commutative (http://planetmath.org/encyclopedia/Commutative.html) at PlanetMath.,
Accessed 8 August 2007.
Definition of commutativity and examples of commutative operations
• Weisstein, Eric W., "Commute (http://mathworld.wolfram.com/Commute.html)" from MathWorld., Accessed
8 August 2007.
Explanation of the term commute
• Yark (http://planetmath.org/?op=getuser&id=2760). Examples of non-commutative operations
(http://planetmath.org/encyclopedia/ExampleOfCommutative.html) at PlanetMath., Accessed 8 August 2007
Examples proving some noncommutative operations
• O'Connor, J J and Robertson, E F. MacTutor history of real numbers
(http://www-history.mcs.st-andrews.ac.uk/HistTopics/Real_numbers_1.html), Accessed 8 August 2007
Article giving the history of the real numbers
• Cabillon, Julio and Miller, Jeff. Earliest Known Uses Of Mathematical Terms
(http://jeff560.tripod.com/c.html), Accessed 22 November 2008
Page covering the earliest uses of mathematical terms
• O'Connor, J J and Robertson, E F. MacTutor biography of François Servois
(http://www-groups.dcs.st-and.ac.uk/~history/Biographies/Servois.html), Accessed 8 August 2007
Biography of François Servois, who first used the term
Noncommutative quantum field theory
In mathematical physics, noncommutative quantum field theory (or quantum field theory on noncommutative
spacetime) is an application of noncommutative mathematics to the spacetime of quantum field theory that is an
outgrowth of noncommutative geometry and index theory in which the coordinate functions[1] are noncommutative.
One commonly studied version of such theories has the "canonical" commutation relation:
$[x^\mu, x^\nu] = i\theta^{\mu\nu}$
which means that (with any given set of axes), it is impossible to accurately measure the position of a particle with
respect to more than one axis. In fact, this leads to an uncertainty relation for the coordinates analogous to the
Heisenberg uncertainty principle.
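The analogy with the Heisenberg principle can be made explicit with a short worked step. This is a sketch under stated assumptions: the canonical relation is taken in its usual literature form $[x^\mu, x^\nu] = i\theta^{\mu\nu}$ with $\theta^{\mu\nu}$ a constant, real, antisymmetric matrix, and the general (Robertson) inequality for two observables is applied to it.

```latex
\begin{align}
\Delta A\,\Delta B &\ge \tfrac{1}{2}\,\bigl|\langle [A, B] \rangle\bigr|
  && \text{(general inequality for observables } A, B\text{)} \\
[x, p] = i\hbar
  &\;\Longrightarrow\; \Delta x\,\Delta p \ge \tfrac{\hbar}{2}
  && \text{(Heisenberg)} \\
[x^\mu, x^\nu] = i\theta^{\mu\nu}
  &\;\Longrightarrow\; \Delta x^\mu\,\Delta x^\nu \ge \tfrac{1}{2}\,\bigl|\theta^{\mu\nu}\bigr|
  && \text{(coordinates)}
\end{align}
```

In this reading the components of $\theta^{\mu\nu}$ play the role that $\hbar$ plays for position and momentum, setting the smallest area in the $\mu$-$\nu$ plane within which a particle can be localized.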
Various lower limits have been claimed for the noncommutative scale (i.e. how accurately positions can be
measured), but there is currently no experimental evidence in favour of such a theory, nor grounds for ruling it out.
One of the novel features of noncommutative field theories is the UV/IR mixing[2] phenomenon, in which the physics
at high energies affects the physics at low energies; this does not occur in quantum field theories in which the
coordinates commute.
Other features include violation of Lorentz invariance due to the preferred direction of noncommutativity.
Relativistic invariance can however be retained in the sense of twisted Poincaré invariance of the theory.[3] The
causality condition is modified from that of the commutative theories.
History and motivation
Heisenberg was the first to suggest extending noncommutativity to the coordinates as a possible way of removing the
infinite quantities appearing in field theories before the renormalization procedure was developed and had gained
acceptance. The first paper on the subject was published in 1947 by Hartland Snyder. The success of the
renormalization method resulted in little attention being paid to the subject for some time. In the 1980s,
mathematicians, most notably Alain Connes, developed noncommutative geometry. Among other things, this work
generalized the notion of differential structure to a noncommutative setting. This led to an operator algebraic
description of noncommutative space-times, and the development of a Yang-Mills theory on a noncommutative
torus.
The particle physics community became interested in the noncommutative approach because of a paper by Nathan
Seiberg and Edward Witten.[4] They argued in the context of string theory that the coordinate functions of the
endpoints of open strings constrained to a D-brane in the presence of a constant Neveu-Schwarz B-field
(equivalent to a constant magnetic field on the brane) would satisfy the noncommutative algebra set out above. The
implication is that a quantum field theory on noncommutative spacetime can be interpreted as a low energy limit of
the theory of open strings.
A paper by Sergio Doplicher, Klaus Fredenhagen and John Roberts[5] set out another motivation for the possible
noncommutativity of space-time. Their argument goes as follows: According to general relativity, when the energy
density grows sufficiently large, a black hole is formed. On the other hand according to the Heisenberg uncertainty
principle, a measurement of a space-time separation causes an uncertainty in momentum inversely proportional to
the extent of the separation. Thus energy whose scale corresponds to the uncertainty in momentum is localized in the
system within a region corresponding to the uncertainty in position. When the separation is small enough, the
Schwarzschild radius of the system is reached and a black hole is formed, which prevents any information from
escaping the system. Thus there is a lower bound for the measurement of length. A sufficient condition for
preventing gravitational collapse can be expressed as an uncertainty relation for the coordinates. This relation can in
turn be derived from a commutation relation for the coordinates.
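The heuristic above can be turned into a rough order-of-magnitude estimate. The sketch below is not the Doplicher-Fredenhagen-Roberts derivation itself: it assumes a momentum uncertainty of about ħ/Δx, an energy of about c times that momentum, and the Schwarzschild radius 2GE/c⁴ for that energy, then finds the separation at which the radius equals the separation. The function names and the numerical factors of order one are illustrative assumptions.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J s
G = 6.67430e-11         # gravitational constant, m^3 kg^-1 s^-2
C = 2.99792458e8        # speed of light, m/s


def localization_energy(delta_x):
    """Energy scale E ~ c * (hbar / delta_x) tied to localizing within delta_x."""
    return HBAR * C / delta_x


def schwarzschild_radius(energy):
    """Schwarzschild radius of a system with total energy `energy` (in joules)."""
    return 2 * G * energy / C**4


def minimal_length():
    """Separation at which the Schwarzschild radius equals the separation itself."""
    return math.sqrt(2 * G * HBAR / C**3)


l_min = minimal_length()
print(f"estimated minimal length: {l_min:.3e} m")
print(f"Schwarzschild radius at that scale: "
      f"{schwarzschild_radius(localization_energy(l_min)):.3e} m")
```

The result is of the order of the Planck length (a few times 10⁻³⁵ m); below that separation the estimated Schwarzschild radius exceeds the region being probed.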
Footnotes
[1] It is possible to have a noncommuting time coordinate, but this causes many problems such as the violation of unitarity of the S-matrix.
Hence most research is restricted to so-called "space-space" noncommutativity. There have been attempts to avoid these problems by
redefining the perturbation theory. However, string theory derivations of noncommutative coordinates exclude time-space noncommutativity.
[2] See, for example, Shiraz Minwalla, Mark Van Raamsdonk, Nathan Seiberg (2000) "Noncommutative Perturbative Dynamics
(http://arxiv.org/abs/hep-th/9912072)," Journal of High Energy Physics, and Alec Matusis, Leonard Susskind, Nicolaos Toumbas (2000) "The IR/UV
Connection in the Non-Commutative Gauge Theories (http://arxiv.org/abs/hep-th/0002075)," Journal of High Energy Physics.
[3] M. Chaichian, P. Presnajder, A. Tureanu (2005) "New concept of relativistic invariance in NC space-time: twisted Poincaré symmetry and its
implications (http://arxiv.org/abs/hep-th/0409096)," Phys. Rev. Letters 94.
[4] Seiberg, N. and E. Witten (1999) "String Theory and Noncommutative Geometry (http://arxiv.org/abs/hep-th/9908142)," Journal of High
Energy Physics.
[5] Sergio Doplicher, Klaus Fredenhagen, John E. Roberts (1995) "The quantum structure of spacetime at the Planck scale and quantum fields
(http://arxiv.org/abs/hep-th/0303037)," Commun. Math. Phys. 172: 187-220.
Further reading
• M.R. Douglas and N. A. Nekrasov (2001) "Noncommutative field theory
(http://prola.aps.org/abstract/RMP/v73/i4/p977_1?qid=a81527af6e5a2fa2&qseq=1&show=10)," Rev. Mod. Phys. 73: 977-1029.
• Szabo, R. J. (2003) "Quantum Field Theory on Noncommutative Spaces
(http://arxiv.org/abs/hep-th/0109162)," Physics Reports 378: 207-99. An expository article on noncommutative quantum field theories.
• Noncommutative quantum field theory, see statistics
(http://xstructure.inr.ac.ru/x-bin/theme3.py?level=2&index1=-173391) on arxiv.org
Noncommutative standard model
In theoretical particle physics, the non-commutative Standard Model, mainly due to the French mathematician
Alain Connes, uses his noncommutative geometry to devise an extension of the Standard Model to include a
modified form of general relativity. This unification implies a few constraints on the parameters of the Standard
Model. Under an additional assumption, known as the "big desert" hypothesis, one of these constraints determines
the mass of the Higgs boson to be around 170 GeV, comfortably within the range of the Large Hadron Collider.
Recent Tevatron experiments exclude a Higgs mass of 158 to 175 GeV at the 95% confidence level.[1]
Background
Current physical theory features four elementary forces: the gravitational force, the electromagnetic force, the weak
force, and the strong force. Gravity has an elegant and experimentally precise theory: Einstein's general relativity. It
is based on Riemannian geometry and interprets the gravitational force as curvature of space-time. Its Lagrangian
formulation requires only two empirical parameters, the gravitational constant and the cosmological constant.
The other three forces also have a Lagrangian theory, called the Standard Model. Its underlying idea is that they are
mediated by the exchange of spin-1 particles, the so-called gauge bosons. The one responsible for electromagnetism
is the photon. The weak force is mediated by the W and Z bosons; the strong force, by gluons. The gauge Lagrangian
is much more complicated than the gravitational one: at present, it involves some 30 real parameters, a number that
could increase. What is more, the gauge Lagrangian must also contain a spin 0 particle, the Higgs boson, to give
mass to the spin 1/2 and spin 1 particles. This particle has yet to be observed, and if it is not detected at the Large
Hadron Collider in Geneva, the consistency of the Standard Model is in doubt.
Alain Connes has generalized Bernhard Riemann's geometry to noncommutative geometry. It describes spaces with
curvature and uncertainty. Historically, the first example of such a geometry is quantum mechanics, which
introduced Heisenberg's uncertainty relation by turning the classical observables of position and momentum into
noncommuting operators. Noncommutative geometry is still sufficiently similar to Riemannian geometry that
Connes was able to rederive general relativity. In doing so, he obtained the gauge Lagrangian as a companion of the
gravitational one, a truly geometric unification of all four fundamental interactions. Connes has thus devised a fully
geometric formulation of the Standard Model, where all the parameters are geometric invariants of a
noncommutative space. A result is that parameters like the electron mass are now analogous to purely mathematical
constants like pi.
Notes
[1] The TEVNPH Working Group (http://arxiv.org/abs/1007.4587)
References
• Alain Connes (1994) Noncommutative geometry (http://www.alainconnes.org/docs/book94bigpdf.pdf).
Academic Press. ISBN 0-12-185860-X.
• Alain Connes (1995) "Noncommutative geometry and reality," J. Math. Phys. 36: 6194.
• Alain Connes (1996) "Gravity coupled with matter and the foundation of noncommutative geometry
(http://arxiv.org/abs/hep-th/9603053)," Comm. Math. Phys. 155: 109.
• Alain Connes (2006) "Noncommutative geometry and physics (http://www.alainconnes.org/docs/einsymp.pdf)."
• Alain Connes and M. Marcolli, Noncommutative Geometry: Quantum Fields and Motives
(http://www.alainconnes.org/en/downloads.php). American Mathematical Society (2007).
• Chamseddine, A., A. Connes (1996) "The spectral action principle (http://arxiv.org/abs/hep-th/9606001),"
Comm. Math. Phys. 182: 155.
• Chamseddine, A., A. Connes, M. Marcolli (2007) "Gravity and the Standard Model with neutrino mixing
(http://arxiv.org/abs/hep-th/0610241)," Adv. Theor. Math. Phys. 11: 991.
• Jureit, Jan-H., Thomas Krajewski, Thomas Schücker, and Christoph A. Stephan (2007) "On the noncommutative
standard model (http://arxiv.org/abs/0705.0489)," Acta Phys. Polon. B38: 3181-3202.
• Schücker, Thomas (2005) Forces from Connes's geometry (http://arxiv.org/abs/hep-th/0111236). Lecture
Notes in Physics 659, Springer.
External links
• Alain Connes official website (http://www.alainconnes.org/) with downloadable papers
(http://www.alainconnes.org/en/downloads.php).
• Alain Connes's Standard Model
(http://resonaances.blogspot.com/2007/02/alain-connes-standard-model.html)
Nonabelian Gauge Theory
In physics, a gauge theory is a type of field theory in which the Lagrangian is invariant under a continuous group of
local transformations.
The term gauge refers to redundant degrees of freedom in the Lagrangian. The transformations between possible
gauges, called gauge transformations, form a Lie group which is referred to as the symmetry group or the gauge
group of the theory. Associated with any Lie group is the Lie algebra of group generators. For each group generator
there necessarily arises a corresponding vector field called the gauge field. Gauge fields are included in the
Lagrangian to ensure its invariance under the local group transformations (called gauge invariance). When such a
theory is quantized, the quanta of the gauge fields are called gauge bosons. If the symmetry group is
non-commutative, the gauge theory is referred to as non-abelian, the usual example being the Yang-Mills theory.
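The distinction between abelian and non-abelian symmetry groups can be seen in a small numerical check. The sketch below compares two U(1) phase factors, which always commute, with two generators of SU(2), taken here to be the Pauli matrices (a standard choice, but an assumption relative to the text); the helper names are illustrative.

```python
import cmath


def matmul(a, b):
    """Multiply two 2x2 matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


def commutator(a, b):
    """Return the matrix ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]


# Pauli matrices; the SU(2) generators are conventionally these divided by 2.
SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, -1j], [1j, 0]]
SIGMA_Z = [[1, 0], [0, -1]]

# Two U(1) elements: plain phase factors.
u1 = cmath.exp(0.3j)
u2 = cmath.exp(1.1j)

print(abs(u1 * u2 - u2 * u1))        # vanishes: U(1) is abelian
print(commutator(SIGMA_X, SIGMA_Y))  # nonzero: equals 2i times SIGMA_Z
```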
Gauge theories are important as the successful field theories explaining the dynamics of elementary particles.
Quantum electrodynamics is an abelian gauge theory with the symmetry group U(1) and has one gauge field, the
electromagnetic field, with the photon being the gauge boson. The Standard Model is a non-abelian gauge theory
with the symmetry group U(1)×SU(2)×SU(3) and has a total of twelve gauge bosons: the photon, three weak bosons
and eight gluons.
Many powerful theories in physics are described by Lagrangians which are invariant under some symmetry
transformation groups. When they are invariant under a transformation identically performed at every point in the
space in which the physical processes occur, they are said to have a global symmetry. The requirement of local
symmetry, the cornerstone of gauge theories, is a stricter constraint. In fact, a global symmetry is just a local
symmetry whose group's parameters are fixed in space-time. Gauge symmetries can be viewed as analogues of the
equivalence principle of general relativity in which each point in spacetime is allowed a choice of local reference
(coordinate) frame. Both symmetries reflect a redundancy in the description of a system.
Historically, these ideas were first stated in the context of classical electromagnetism and later in general relativity.
However, the modern importance of gauge symmetries appeared first in the relativistic quantum mechanics of
electrons — quantum electrodynamics, elaborated on below. Today, gauge theories are useful in condensed matter,
nuclear and high energy physics among other subfields.
History and importance
The earliest field theory having a gauge symmetry was Maxwell's formulation of electrodynamics in 1864. The
importance of this symmetry remained unnoticed in the earliest formulations. Similarly unnoticed, Hilbert had
derived the Einstein field equations by postulating the invariance of the action under a general coordinate
transformation. Later Hermann Weyl, in an attempt to unify general relativity and electromagnetism, conjectured
(incorrectly, as it turned out) that Eichinvarianz or invariance under the change of scale (or "gauge") might also be a
local symmetry of general relativity. After the development of quantum mechanics, Weyl, Vladimir Fock and Fritz
London modified gauge by replacing the scale factor with a complex quantity and turned the scale transformation
into a change of phase, a U(1) gauge symmetry. This explained the electromagnetic field effect on the wave
function of a charged quantum mechanical particle. This was the first widely recognised gauge theory, popularised
by Pauli in the 1940s. [1]
In 1954, attempting to resolve some of the great confusion in elementary particle physics, Chen Ning Yang and
Robert Mills introduced non-abelian gauge theories as models to understand the strong interaction holding together
nucleons in atomic nuclei. (Ronald Shaw, working under Abdus Salam, independently introduced the same notion in
his doctoral thesis.) Generalizing the gauge invariance of electromagnetism, they attempted to construct a theory
based on the action of the (non-abelian) SU(2) symmetry group on the isospin doublet of protons and neutrons. This
is similar to the action of the U(1) group on the spinor fields of quantum electrodynamics. In particle physics the
emphasis was on using quantized gauge theories.
This idea later found application in the quantum field theory of the weak force, and its unification with
electromagnetism in the electroweak theory. Gauge theories became even more attractive when it was realized that
non-abelian gauge theories reproduced a feature called asymptotic freedom. Asymptotic freedom was believed to be
an important characteristic of strong interactions. This motivated searching for a strong force gauge theory. This
theory, now known as quantum chromodynamics, is a gauge theory with the action of the SU(3) group on the color
triplet of quarks. The Standard Model unifies the description of electromagnetism, weak interactions and strong
interactions in the language of gauge theory.
In the 1970s, Sir Michael Atiyah began studying the mathematics of solutions to the classical Yang-Mills equations.
In 1983, Atiyah's student Simon Donaldson built on this work to show that the differentiable classification of smooth
4-manifolds is very different from their classification up to homeomorphism. Michael Freedman used Donaldson's
work to exhibit exotic $\mathbb{R}^4$s, that is, exotic differentiable structures on Euclidean 4-dimensional space. This led to an
increasing interest in gauge theory for its own sake, independent of its successes in fundamental physics. In 1994,
Edward Witten and Nathan Seiberg invented gauge-theoretic techniques based on supersymmetry which enabled the
calculation of certain topological invariants. These contributions to mathematics from gauge theory have led to a
renewed interest in this area.
The importance of gauge theories for physics stems from the tremendous success of the mathematical formalism in
providing a unified framework to describe the quantum field theories of electromagnetism, the weak force and the
strong force. This theory, known as the Standard Model, accurately predicts experimental results regarding
three of the four fundamental forces of nature, and is a gauge theory with the gauge group SU(3) × SU(2) × U(1).
Modern theories like string theory, as well as some formulations of general relativity, are, in one way or another,
gauge theories.
See Pickering [2] for more about the history of gauge and quantum field theories.
Description
Global and local symmetries
In physics, the mathematical description of any physical situation usually contains excess degrees of freedom; the
same physical situation is equally well described by many equivalent mathematical configurations. For instance, in
Newtonian dynamics, if two configurations are related by a Galilean transformation — an inertial change of reference
frame — they represent the same physical situation. These transformations form a group of "symmetries" of the
theory, and a physical situation corresponds not to an individual mathematical configuration but to a class of
configurations related to one another by this symmetry group. This idea can be generalized to include local as well as
global symmetries, analogous to much more abstract "changes of coordinates" in a situation where there is no
preferred "inertial" coordinate system that covers the entire physical system. A gauge theory is a mathematical
model that has symmetries of this kind, together with a set of techniques for making physical predictions consistent
with the symmetries of the model.
Example of global symmetry
When a quantity occurring in the mathematical configuration is not just a number but has some geometrical
significance, such as a velocity or an axis of rotation, its representation as numbers arranged in a vector or matrix is
also changed by a coordinate transformation. For instance, if one description of a pattern of fluid flow states that the
fluid velocity in the neighborhood of (x=1, y=0) is 1 m/s in the positive x direction, then a description of the same
situation in which the coordinate system has been rotated clockwise by 90 degrees will state that the fluid velocity in
the neighborhood of (x=0, y=1) is 1 m/s in the positive y direction. The coordinate transformation has affected both
the coordinate system used to identify the location of the measurement and the basis in which its value is expressed.
As long as this transformation is performed globally (affecting the coordinate basis in the same way at every point),
the effect on values that represent the rate of change of some quantity along some path in space and time as it passes
through point P is the same as the effect on values that are truly local to P.
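The fluid-flow example can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper name passive_rotation is a hypothetical choice for illustration. It builds the matrix that re-expresses vector components in axes rotated by a given angle and applies it to both the measurement location and the velocity.

```python
import numpy as np

def passive_rotation(angle_rad):
    """Matrix giving the components of a vector in axes rotated
    counterclockwise by angle_rad (a passive transformation)."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[c, s], [-s, c]])

# Axes rotated clockwise by 90 degrees, i.e. counterclockwise by -90 degrees.
R = passive_rotation(-np.pi / 2)

position = np.array([1.0, 0.0])   # (x=1, y=0) in the original axes
velocity = np.array([1.0, 0.0])   # 1 m/s along +x in the original axes

# A global transformation uses the same matrix for the location
# and for the basis in which the velocity is expressed.
new_position = R @ position
new_velocity = R @ velocity
```

Up to rounding, new_position and new_velocity both come out as (0, 1), matching the statement that the same flow is described at (x=0, y=1) with velocity along positive y.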
Use of fiber bundles to describe local symmetries
In order to adequately describe physical situations in more complex theories, it is often necessary to introduce a
"coordinate basis" for some of the objects of the theory that do not have this simple relationship to the coordinates
used to label points in space and time. (In mathematical terms, the theory involves a fiber bundle in which the fiber
at each point of the base space consists of possible coordinate bases for use when describing the values of objects at
that point.) In order to spell out a mathematical configuration, one must choose a particular coordinate basis at each
point (a local section of the fiber bundle) and express the values of the objects of the theory (usually "fields" in the
physicist's sense) using this basis. Two such mathematical configurations are equivalent (describe the same physical
situation) if they are related by a transformation of this abstract coordinate basis (a change of local section, or gauge
transformation).
In most gauge theories, the set of possible transformations of the abstract gauge basis at an individual point in space
and time is a finite-dimensional Lie group. The simplest such group is U(1), which appears in the modern
formulation of quantum electrodynamics (QED) via its use of complex numbers. QED is generally regarded as the
first, and simplest, physical gauge theory. The set of possible gauge transformations of the entire configuration of a
given gauge theory also forms a group, the gauge group of the theory. An element of the gauge group can be
parameterized by a smoothly varying function from the points of spacetime to the (finite-dimensional) Lie group,
whose value at each point represents the action of the gauge transformation on the fiber over that point.
A gauge transformation with constant parameter at every point in space and time is analogous to a rigid rotation of
the geometric coordinate system; it represents a global symmetry of the gauge representation. As in the case of a
rigid rotation, this gauge transformation affects expressions that represent the rate of change along a path of some
gauge-dependent quantity in the same way as those that represent a truly local quantity. A gauge transformation
whose parameter is not a constant function is referred to as a local symmetry; its effect on expressions that involve a
derivative is qualitatively different from that on expressions that don't. (This is analogous to a non-inertial change of
reference frame, which can produce a Coriolis effect.)
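The difference between a constant and a position-dependent parameter can be made explicit in the simplest case of a U(1) phase acting on a single field in one dimension. The sketch below assumes SymPy is available; the symbol names theta0, theta and psi are illustrative choices, not notation from the text above.

```python
import sympy as sp

x = sp.symbols("x", real=True)
theta0 = sp.symbols("theta0", real=True)   # constant (global) parameter
theta = sp.Function("theta")(x)            # position-dependent (local) parameter
psi = sp.Function("psi")(x)                # a gauge-dependent field

# Global transformation: the phase factor passes through the derivative,
# so the derivative transforms exactly like the field itself.
global_extra = sp.simplify(
    sp.diff(sp.exp(sp.I * theta0) * psi, x)
    - sp.exp(sp.I * theta0) * sp.diff(psi, x)
)

# Local transformation: the product rule leaves an extra term
# proportional to the derivative of the parameter.
local_extra = sp.simplify(
    sp.diff(sp.exp(sp.I * theta) * psi, x)
    - sp.exp(sp.I * theta) * sp.diff(psi, x)
)
expected = sp.I * sp.diff(theta, x) * sp.exp(sp.I * theta) * psi
```

Here global_extra vanishes, while local_extra equals the term stored in expected; this leftover term is what a gauge field is introduced to absorb.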
Gauge fields
The "gauge covariant" version of a gauge theory accounts for this effect by introducing a gauge field (in
mathematical language, an Ehresmann connection) and formulating all rates of change in terms of the covariant
derivative with respect to this connection. The gauge field becomes an essential part of the description of a
mathematical configuration. A configuration in which the gauge field can be eliminated by a gauge transformation
has the property that its field strength (in mathematical language, its curvature) is zero everywhere; a gauge theory is
not limited to these configurations. In other words, the distinguishing characteristic of a gauge theory is that the
gauge field does not merely compensate for a poor choice of coordinate system; there is generally no gauge
transformation that makes the gauge field vanish.
When analyzing the dynamics of a gauge theory, the gauge field must be treated as a dynamical variable, similarly to
other objects in the description of a physical situation. In addition to its interaction with other objects via the
covariant derivative, the gauge field typically contributes energy in the form of a "self-energy" term. One can obtain
the equations for the gauge theory by:
• starting from a naive ansatz without the gauge field (in which the derivatives appear in a "bare" form);
• listing those global symmetries of the theory that can be characterized by a continuous parameter (generally an
abstract equivalent of a rotation angle);
• computing the correction terms that result from allowing the symmetry parameter to vary from place to place; and
• reinterpreting these correction terms as couplings to one or more gauge fields, and giving these fields appropriate
self-energy terms and dynamical behavior.
This is the sense in which a gauge theory "extends" a global symmetry to a local symmetry, and closely resembles
the historical development of the gauge theory of gravity known as general relativity.
Physical experiments
Gauge theories are used to model the results of physical experiments, essentially by:
• limiting the universe of possible configurations to those consistent with the information used to set up the
experiment, and then
• computing the probability distribution of the possible outcomes that the experiment is designed to measure.
The mathematical descriptions of the "setup information" and the "possible measurement outcomes" (loosely
speaking, the "boundary conditions" of the experiment) are generally not expressible without reference to a particular
coordinate system, including a choice of gauge. (If nothing else, one assumes that the experiment has been
adequately isolated from "external" influence, which is itself a gauge-dependent statement.) Mishandling gauge
dependence in boundary conditions is a frequent source of anomalies in gauge theory calculations, and gauge
theories can be broadly classified by their approaches to anomaly avoidance.
Continuum theories
The two gauge theories mentioned above (continuum electrodynamics and general relativity) are examples of
continuum field theories. The techniques of calculation in a continuum theory implicitly assume that:
• given a completely fixed choice of gauge, the boundary conditions of an individual configuration can in principle
be completely described;
• given a completely fixed gauge and a complete set of boundary conditions, the principle of least action determines
a unique mathematical configuration (and therefore a unique physical situation) consistent with these bounds;
• the likelihood of possible measurement outcomes can be determined by:
• establishing a probability distribution over all physical situations determined by boundary conditions that are
consistent with the setup information,
• establishing a probability distribution of measurement outcomes for each possible physical situation, and
• convolving these two probability distributions to get a distribution of possible measurement outcomes
consistent with the setup information; and
• fixing the gauge introduces no anomalies in the calculation, due either to gauge dependence in describing partial
information about boundary conditions or to incompleteness of the theory.
These assumptions are close enough to valid, across a wide range of energy scales and experimental conditions, to
allow these theories to make accurate predictions about almost all of the phenomena encountered in daily life, from
light, heat, and electricity to eclipses and spaceflight. They fail only at the smallest and largest scales (due to
omissions in the theories themselves) and when the mathematical techniques themselves break down (most notably
in the case of turbulence and other chaotic phenomena).
Quantum field theories
Other than these "classical" continuum field theories, the most widely known gauge theories are quantum field
theories, including quantum electrodynamics and the Standard Model of elementary particle physics. The starting
point of a quantum field theory is much like that of its continuum analog: a gauge-covariant action integral which
characterizes "allowable" physical situations according to the principle of least action. However, continuum and
quantum theories differ significantly in how they handle the excess degrees of freedom represented by gauge
transformations. Continuum theories, and most pedagogical treatments of the simplest quantum field theories, use a
gauge fixing prescription to reduce the orbit of mathematical configurations that represent a given physical situation
to a smaller orbit related by a smaller gauge group (the global symmetry group, or perhaps even the trivial group).
More sophisticated quantum field theories, in particular those which involve a non-abelian gauge group, break the
gauge symmetry within the techniques of perturbation theory by introducing additional fields (the Faddeev-Popov
ghosts) and counterterms motivated by anomaly cancellation, in an approach known as BRST quantization. While
these concerns are in one sense highly technical, they are also closely related to the nature of measurement, the limits
on knowledge of a physical situation, and the interactions between incompletely specified experimental conditions
and incompletely understood physical theory. The mathematical techniques that have been developed in order to
make gauge theories tractable have found many other applications, from solid-state physics and crystallography to
low-dimensional topology.
Classical gauge theory
Classical electromagnetism
Historically, the first example of gauge symmetry to be discovered was classical electromagnetism. In static
electricity, one can either discuss the electric field, E, or its corresponding electric potential, V. Knowledge of one
makes it possible to find the other, except that potentials differing by a constant, $V \to V + C$, correspond to the
same electric field. This is because the electric field relates to changes in the potential from one point in space to
another, and the constant C would cancel out when subtracting to find the change in potential. In terms of vector
calculus, the electric field is the gradient of the potential, $\mathbf{E} = -\nabla V$. Generalizing from static electricity to
electromagnetism, we have a second potential, the vector potential A, with
$$\mathbf{E} = -\nabla V - \frac{\partial \mathbf{A}}{\partial t}, \qquad \mathbf{B} = \nabla \times \mathbf{A}.$$
The general gauge transformations now become not just $V \to V + C$ but
$$\mathbf{A} \to \mathbf{A} + \nabla f, \qquad V \to V - \frac{\partial f}{\partial t},$$
where f is any function that depends on position and time. The fields remain the same under the gauge
transformation, and therefore Maxwell's equations are still satisfied. That is, Maxwell's equations have a gauge
symmetry.
An example: Scalar O(n) gauge theory
The remainder of this section requires some familiarity with classical or quantum field theory, and the use of
Lagrangians.
Definitions in this section: gauge group, gauge field, interaction Lagrangian, gauge boson.
The following illustrates how local gauge invariance can be "motivated" heuristically starting from global symmetry
properties, and how it leads to an interaction between fields which were originally non-interacting.
Consider a set of n non-interacting scalar fields, with equal masses m. This system is described by an action which is
the sum of the (usual) action for each scalar field $\varphi_i$:
$$S = \int d^4x \sum_{i=1}^{n} \left[ \frac{1}{2} \partial_\mu \varphi_i \, \partial^\mu \varphi_i - \frac{1}{2} m^2 \varphi_i^2 \right].$$
The Lagrangian (density) can be compactly written as
$$\mathcal{L} = \frac{1}{2} (\partial_\mu \Phi)^T \partial^\mu \Phi - \frac{1}{2} m^2 \Phi^T \Phi$$
by introducing a vector of fields
$$\Phi = (\varphi_1, \varphi_2, \ldots, \varphi_n)^T.$$
The term $\partial_\mu$ is Einstein notation for the partial derivative of $\Phi$ in each of the four dimensions. It is now transparent
that the Lagrangian is invariant under the transformation
$$\Phi \mapsto \Phi' = G\Phi$$
whenever G is a constant matrix belonging to the n-by-n orthogonal group O(n). This is seen to preserve the
Lagrangian since the derivative of $\Phi$ will transform identically to $\Phi$ and both quantities appear inside dot products
in the Lagrangian (orthogonal transformations preserve the dot product).
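The global O(n) invariance can be spot-checked numerically at a single spacetime point. This is only a sketch under stated assumptions: NumPy is available, the metric signature is taken as (+,-,-,-), and the values of n and m, the random field values and the function name lagrangian are all illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3                                     # number of scalar fields (illustrative)
m = 1.5                                   # common mass (illustrative)
eta = np.diag([1.0, -1.0, -1.0, -1.0])    # Minkowski metric, assumed signature

def lagrangian(phi, dphi):
    """1/2 (d_mu Phi)^T d^mu Phi - 1/2 m^2 Phi^T Phi at one spacetime point.

    phi  : shape (n,)    field values
    dphi : shape (4, n)  dphi[mu] holds d_mu Phi
    """
    kinetic = 0.5 * np.einsum("mn,mi,ni->", eta, dphi, dphi)
    mass = 0.5 * m**2 * (phi @ phi)
    return kinetic - mass

# A random orthogonal matrix G obtained from a QR decomposition.
G, _ = np.linalg.qr(rng.normal(size=(n, n)))

phi = rng.normal(size=n)
dphi = rng.normal(size=(4, n))

L_before = lagrangian(phi, dphi)
# For constant G the derivative transforms like the field: d_mu(G Phi) = G d_mu Phi.
L_after = lagrangian(G @ phi, dphi @ G.T)
```

The two values agree to rounding error because both terms are dot products in field space, which any orthogonal G preserves.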
This characterizes the global symmetry of this particular Lagrangian, and the symmetry group is often called the
gauge group; the mathematical term is structure group, especially in the theory of G-structures. Incidentally,
Noether's theorem implies that invariance under this group of transformations leads to the conservation of the
current
$$J^a_\mu = i\, \partial_\mu \Phi^T T^a \Phi$$
where the $T^a$ matrices are generators of the SO(n) group. There is one conserved current for every generator.
Now, demanding that this Lagrangian should have local O(n)-invariance requires that the G matrices (which were
earlier constant) should be allowed to become functions of the space-time coordinates x.
Unfortunately, the G matrices do not "pass through" the derivatives; when G = G(x),
$$\partial_\mu (G\Phi) \neq G (\partial_\mu \Phi).$$
The failure of the derivative to commute with G introduces an additional term (in keeping with the product rule)
which spoils the invariance of the Lagrangian. In order to rectify this we define a new derivative operator such that
the derivative of $\Phi$ will again transform identically with $\Phi$:
$$(D_\mu \Phi)' = G D_\mu \Phi.$$
This new "derivative" is called a covariant derivative and takes the form
$$D_\mu = \partial_\mu + g A_\mu,$$
where g is called the coupling constant, a quantity defining the strength of an interaction. After a simple
calculation we can see that the gauge field A(x) must transform as follows:
$$A'_\mu = G A_\mu G^{-1} - \frac{1}{g} (\partial_\mu G) G^{-1}.$$
The gauge field is an element of the Lie algebra, and can therefore be expanded as
$$A_\mu = \sum_a A_\mu^a T^a.$$
There are therefore as many gauge fields as there are generators of the Lie algebra.
Finally, we now have a locally gauge invariant Lagrangian
$$\mathcal{L}_{\mathrm{loc}} = \frac{1}{2} (D_\mu \Phi)^T D^\mu \Phi - \frac{1}{2} m^2 \Phi^T \Phi.$$
Pauli uses the term gauge transformation of the first type for the one applied to fields such as $\Phi$, while the
compensating transformation in A is said to be a gauge transformation of the second type.
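The "simple calculation" behind the transformation law of the gauge field can be reproduced symbolically in the smallest case. The sketch below assumes SymPy is available and restricts to the group SO(2) acting on two fields that depend on a single coordinate x; the names theta, a, phi1, phi2 and the helper D are hypothetical. Since SO(2) is abelian, this checks the compensating derivative term but does not exercise the non-commuting part of the law.

```python
import sympy as sp

x, g = sp.symbols("x g", positive=True)
theta = sp.Function("theta")(x)          # local rotation angle
a = sp.Function("a")(x)                  # single gauge-field component
phi1, phi2 = sp.Function("phi1")(x), sp.Function("phi2")(x)

Phi = sp.Matrix([phi1, phi2])
T = sp.Matrix([[0, -1], [1, 0]])         # real antisymmetric generator of SO(2)
G = sp.Matrix([[sp.cos(theta), -sp.sin(theta)],
               [sp.sin(theta),  sp.cos(theta)]])
A = a * T                                # gauge field valued in the Lie algebra

def D(field, gauge_field):
    """Covariant derivative D = d/dx + g A acting on a column of fields."""
    return field.diff(x) + g * gauge_field * field

# Transformation law A' = G A G^{-1} - (1/g) (dG/dx) G^{-1};
# for an orthogonal matrix the inverse is the transpose.
A_prime = G * A * G.T - (G.diff(x) * G.T) / g

# Covariance requires D'(G Phi) - G D(Phi) to vanish identically.
residual = (D(G * Phi, A_prime) - G * D(Phi, A)).applyfunc(
    lambda e: sp.simplify(sp.expand(e))
)
```

The residual is the zero vector, and in this abelian case A_prime reduces to (a - theta'/g) T, so the gauge field shifts by the derivative of the local parameter.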
The difference between this Lagrangian and the original globally gauge-invariant Lagrangian is seen to be the
interaction Lagrangian
$$\mathcal{L}_{\mathrm{int}} = \frac{g}{2} \Phi^T A_\mu^T \partial^\mu \Phi + \frac{g}{2} (\partial_\mu \Phi)^T A^\mu \Phi + \frac{g^2}{2} (A_\mu \Phi)^T A^\mu \Phi.$$
[Figure: Feynman diagram of scalar bosons interacting via a gauge boson]
This term introduces interactions between the n scalar fields just as a consequence of the demand for local gauge
invariance. However, to make this interaction physical and not completely arbitrary, the mediator A(x) needs to
propagate in space. That is dealt with in the next section by adding yet another term, $\mathcal{L}_{\mathrm{gf}}$, to the Lagrangian. In the
quantized version of the obtained classical field theory, the quanta of the gauge field A(x) are called gauge bosons.
The interpretation of the interaction Lagrangian in quantum field theory is of scalar bosons interacting by the
exchange of these gauge bosons.
The Yang-Mills Lagrangian for the gauge field
The picture of a classical gauge theory developed in the previous section is almost complete, except for the fact that
to define the covariant derivatives D, one needs to know the value of the gauge field A(x) at all space-time points.
Instead of manually specifying the values of this field, it can be given as the solution to a field equation. Further
requiring that the Lagrangian which generates this field equation is locally gauge invariant as well, one possible form
for the gauge field Lagrangian is (conventionally) written as
$$\mathcal{L}_{\mathrm{gf}} = -\frac{1}{2} \operatorname{Tr}(F^{\mu\nu} F_{\mu\nu})$$
with
$$F_{\mu\nu} = \frac{1}{ig} [D_\mu, D_\nu]$$
and the trace being taken over the vector space of the fields. This is called the Yang-Mills action. Other gauge
invariant actions also exist (e.g. nonlinear electrodynamics, Born-Infeld action, Chern-Simons model, theta term
etc.).
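The commutator of covariant derivatives that defines the field strength can be expanded symbolically. The following sketch assumes SymPy is available and uses 2-by-2 matrix-valued gauge fields depending on two coordinates x and y; the names matfun, Ax, Ay and curvature are illustrative. It verifies only the identity for the commutator with D = d + gA, and leaves the overall normalization of F to the convention chosen in the text.

```python
import sympy as sp

x, y, g = sp.symbols("x y g")

def matfun(name):
    """2x2 matrix whose entries are arbitrary functions of (x, y)."""
    return sp.Matrix(2, 2, lambda i, j: sp.Function(f"{name}{i}{j}")(x, y))

Ax, Ay = matfun("Ax"), matfun("Ay")      # two components of the gauge field
Phi = sp.Matrix([sp.Function("phi0")(x, y), sp.Function("phi1")(x, y)])

def D(field, coord, A):
    """Covariant derivative along coord: d/dcoord + g A."""
    return field.diff(coord) + g * A * field

# [D_x, D_y] acting on Phi.
commutator = D(D(Phi, y, Ay), x, Ax) - D(D(Phi, x, Ax), y, Ay)

# The derivative terms plus a term that survives only if Ax and Ay do not commute.
curvature = Ay.diff(x) - Ax.diff(y) + g * (Ax * Ay - Ay * Ax)

residual = (commutator - g * curvature * Phi).applyfunc(sp.expand)
```

All derivatives of Phi cancel, so the commutator acts on Phi as multiplication by a matrix rather than as a differential operator; the term g(Ax Ay - Ay Ax) is the one that is absent in an abelian theory.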
Note that in this Lagrangian term there is no field whose transformation counterweighs that of $A$. Invariance of
this term under gauge transformations is a particular case of a priori classical (geometrical) symmetry. This
symmetry must be restricted in order to perform quantization, a procedure called gauge fixing, but
even after restriction, gauge transformations may be possible. [3]
The complete Lagrangian for the gauge theory is now
$$\mathcal{L} = \mathcal{L}_{\mathrm{loc}} + \mathcal{L}_{\mathrm{gf}} = \mathcal{L}_{\mathrm{global}} + \mathcal{L}_{\mathrm{int}} + \mathcal{L}_{\mathrm{gf}}.$$
An example: Electrodynamics
As a simple application of the formalism developed in the previous sections, consider the case of electrodynamics,
with only the electron field. The bare-bones action which generates the electron field's Dirac equation is
$$S = \int \bar\psi \left( i\hbar c\, \gamma^\mu \partial_\mu - m c^2 \right) \psi \, d^4x.$$
The global symmetry for this system is
$$\psi \mapsto e^{i\theta} \psi.$$
The gauge group here is U(1), just the phase angle of the field, with a constant $\theta$.
"Localising" this symmetry implies the replacement of $\theta$ by $\theta(x)$.
An appropriate covariant derivative is then
$$D_\mu = \partial_\mu - i \frac{e}{\hbar} A_\mu.$$
Identifying the "charge" e with the usual electric charge (this is the origin of the usage of the term in gauge theories),
and the gauge field A(x) with the four-vector potential of the electromagnetic field results in an interaction Lagrangian
$$\mathcal{L}_{\mathrm{int}} = \frac{e}{\hbar} \bar\psi(x) \gamma^\mu \psi(x) A_\mu(x) = J^\mu(x) A_\mu(x),$$
where $J^\mu(x)$ is the usual four-vector electric current density. The gauge principle is therefore seen to naturally
introduce the so-called minimal coupling of the electromagnetic field to the electron field.
Adding a Lagrangian for the gauge field $A_\mu(x)$ in terms of the field strength tensor exactly as in electrodynamics,
one obtains the Lagrangian which is used as the starting point in quantum electrodynamics:
$$\mathcal{L}_{\mathrm{QED}} = \bar\psi \left( i\hbar c\, \gamma^\mu D_\mu - m c^2 \right) \psi - \frac{1}{4\mu_0} F_{\mu\nu} F^{\mu\nu}.$$
See also: Dirac equation, Maxwell's equations, Quantum electrodynamics
Mathematical formalism
Gauge theories are usually discussed in the language of differential geometry. Mathematically, a gauge is just a
choice of a (local) section of some principal bundle. A gauge transformation is just a transformation between two
such sections.
Although gauge theory is dominated by the study of connections (primarily because it's mainly studied by
high-energy physicists), the idea of a connection is not central to gauge theory in general. In fact, a result in general
gauge theory shows that affine representations (i.e. affine modules) of the gauge transformations can be classified as
sections of a jet bundle satisfying certain properties. There are representations which transform covariantly pointwise
(called by physicists gauge transformations of the first kind), representations which transform as a connection form
(called by physicists gauge transformations of the second kind, an affine representation) and other more general
representations, such as the B field in BF theory. There are more general nonlinear representations (realizations), but
these are extremely complicated. Still, nonlinear sigma models transform nonlinearly, so there are applications.
If there is a principal bundle P whose base space is space or spacetime and structure group is a Lie group, then the
sections of P form a principal homogeneous space of the group of gauge transformations.
Connections (gauge connections) define this principal bundle, yielding a covariant derivative $\nabla$ in each associated
vector bundle. If a local frame is chosen (a local basis of sections), then this covariant derivative is represented by
the connection form A, a Lie algebra-valued 1-form which is called the gauge potential in physics. This is evidently
not an intrinsic but a frame-dependent quantity. The curvature form F, a Lie algebra-valued 2-form which is an
intrinsic quantity, is constructed from a connection form by
$$F = dA + A \wedge A$$
where d stands for the exterior derivative and $\wedge$ stands for the wedge product. (A is an element of the vector space
spanned by the generators $T^a$, and so the components of A do not commute with one another. Hence the wedge
product $A \wedge A$ does not vanish.)
Infinitesimal gauge transformations form a Lie algebra, which is characterized by a smooth Lie algebra-valued
scalar, $\varepsilon$. Under such an infinitesimal gauge transformation,
$$\delta_\varepsilon A = [\varepsilon, A] - d\varepsilon$$
where $[\cdot, \cdot]$ is the Lie bracket.
One nice thing is that if $\delta_\varepsilon X = \varepsilon X$, then $\delta_\varepsilon DX = \varepsilon DX$, where D is the covariant derivative
$$DX = dX + AX.$$
Also, $\delta_\varepsilon F = \varepsilon F$, which means F transforms covariantly.
Not all gauge transformations can be generated by infinitesimal gauge transformations in general. An example is
when the base manifold is a compact manifold without boundary such that the homotopy class of mappings from that
manifold to the Lie group is nontrivial. See instanton for an example.
The Yang-Mills action is now given by
$$\frac{1}{4g^2} \int \operatorname{Tr}[*F \wedge F]$$
where * stands for the Hodge dual and the integral is defined as in differential geometry.
A quantity which is gauge-invariant, i.e. invariant under gauge transformations, is the Wilson loop, which is defined
over any closed path, $\gamma$, as follows:
$$\chi^{(\rho)}\left( \mathcal{P} \left\{ e^{\int_\gamma A} \right\} \right)$$
where $\chi$ is the character of a complex representation $\rho$ and $\mathcal{P}$ represents the path-ordered operator.
Quantization of gauge theories
Gauge theories may be quantized by specialization of methods which are applicable to any quantum field theory.
However, because of the subtleties imposed by the gauge constraints (see section on Mathematical formalism,
above) there are many technical problems to be solved which do not arise in other field theories. At the same time,
the richer structure of gauge theories allows simplification of some computations: for example, Ward identities
connect different renormalization constants.
Methods and aims
The first gauge theory to be quantized was quantum electrodynamics (QED). The first methods developed for this
involved gauge fixing and then applying canonical quantization. The Gupta-Bleuler method was also developed to
handle this problem. Non-abelian gauge theories are now handled by a variety of means. Methods for quantization
are covered in the article on quantization.
The main point of quantization is to be able to compute quantum amplitudes for various processes allowed by the
theory. Technically, they reduce to the computations of certain correlation functions in the vacuum state. This
involves a renormalization of the theory.
When the running coupling of the theory is small enough, then all required quantities may be computed in
perturbation theory. Quantization schemes intended to simplify such computations (such as canonical quantization)
may be called perturbative quantization schemes. At present some of these methods lead to the most precise
experimental tests of gauge theories.
However, in most gauge theories, there are many interesting questions which are non-perturbative. Quantization
schemes suited to these problems (such as lattice gauge theory) may be called non-perturbative quantization
schemes. Precise computations in such schemes often require supercomputing, and are therefore less well-developed
currently than other schemes.
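On a lattice, the smallest closed path is a single plaquette, and the trace of the ordered product of link variables around it is the lattice counterpart of the Wilson loop from the Mathematical formalism section. The sketch below, assuming NumPy is available, checks that this trace is unchanged by a random local SU(2) transformation on a small periodic two-dimensional lattice. The lattice size, the link-variable conventions and all function names are assumptions made for illustration; nothing here is a production lattice code.

```python
import numpy as np

rng = np.random.default_rng(1)
L = 4  # sites per direction, periodic boundary conditions (illustrative)

def random_su2(shape):
    """Random SU(2) matrices a0*1 + i(a1*s1 + a2*s2 + a3*s3), unit (a0, a)."""
    q = rng.normal(size=shape + (4,))
    q /= np.linalg.norm(q, axis=-1, keepdims=True)
    a0, a1, a2, a3 = np.moveaxis(q, -1, 0)
    U = np.empty(shape + (2, 2), dtype=complex)
    U[..., 0, 0] = a0 + 1j * a3
    U[..., 0, 1] = a2 + 1j * a1
    U[..., 1, 0] = -a2 + 1j * a1
    U[..., 1, 1] = a0 - 1j * a3
    return U

def dagger(M):
    """Hermitian conjugate over the last two (matrix) axes."""
    return np.conj(np.swapaxes(M, -1, -2))

def plaquette_traces(U):
    """Tr[U_x(n) U_y(n+x) U_x(n+y)^dagger U_y(n)^dagger] at every site n."""
    Ux, Uy = U[0], U[1]
    Uy_at_n_plus_x = np.roll(Uy, -1, axis=0)
    Ux_at_n_plus_y = np.roll(Ux, -1, axis=1)
    P = Ux @ Uy_at_n_plus_x @ dagger(Ux_at_n_plus_y) @ dagger(Uy)
    return np.trace(P, axis1=-2, axis2=-1)

def gauge_transform(U, G):
    """Local transformation U_mu(n) -> G(n) U_mu(n) G(n+mu)^dagger."""
    out = np.empty_like(U)
    for mu in range(2):
        out[mu] = G @ U[mu] @ dagger(np.roll(G, -1, axis=mu))
    return out

U = random_su2((2, L, L))      # link variables U[mu, ix, iy]
G = random_su2((L, L))         # independent group element at every site
before = plaquette_traces(U)
after = plaquette_traces(gauge_transform(U, G))
```

The individual links change under the transformation, but before and after agree site by site, because the group elements at interior corners of the loop cancel and the remaining pair at the base point drops out of the trace.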
Anomalies
Some of the symmetries of the classical theory are then seen not to hold in the quantum theory — a phenomenon
called an anomaly. Among the most well known are:
• The scale anomaly, which gives rise to a running coupling constant. In QED this gives rise to the phenomenon of
the Landau pole. In Quantum Chromodynamics (QCD) this leads to asymptotic freedom.
• The chiral anomaly in either chiral or vector field theories with fermions. This has close connection with topology
through the notion of instantons. In QCD this anomaly causes the decay of a pion to two photons.
• The gauge anomaly, which must cancel in any consistent physical theory. In the electroweak theory this
cancellation requires an equal number of quarks and leptons.
Pure gauge
A pure gauge is the set of field configurations obtained by a gauge transformation on the null field configuration. So
it is a particular "gauge orbit" in the field configuration's space.
In the abelian case, where $A_\mu(x) \to A'_\mu(x) = A_\mu(x) + \partial_\mu f(x)$, the pure gauge is the set of field
configurations $A'_\mu(x) = \partial_\mu f(x)$ for all f(x).
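A pure-gauge configuration in the abelian case carries no field strength, which can be confirmed symbolically. The sketch assumes SymPy is available and uses the standard abelian field strength, the antisymmetrized derivative of the potential, which is not written out in this section; the coordinate names are illustrative.

```python
import sympy as sp

coords = sp.symbols("t x y z", real=True)
f = sp.Function("f")(*coords)            # arbitrary gauge function

# Pure-gauge configuration: A_mu = d_mu f.
A = [sp.diff(f, c) for c in coords]

# Abelian field strength F_{mu nu} = d_mu A_nu - d_nu A_mu.
F = sp.Matrix(
    4, 4,
    lambda mu, nu: sp.simplify(
        sp.diff(A[nu], coords[mu]) - sp.diff(A[mu], coords[nu])
    ),
)
```

Every entry of F is zero because mixed partial derivatives of f commute, consistent with the earlier remark that a gauge field removable by a gauge transformation has zero curvature.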
Bibliography
General readers:
• Schumm, Bruce (2004) Deep Down Things. Johns Hopkins University Press. Esp. chpt. 8. A serious attempt by a
physicist to explain gauge theory and the Standard Model with little formal mathematics.
Texts:
• Bromley, D.A. (2000). Gauge Theory of Weak Interactions. Springer. ISBN 3-540-67672-4.
• Cheng, T.-P.; Li, L.-F. (1983). Gauge Theory of Elementary Particle Physics. Oxford University Press.
ISBN 0-19-851961-3.
• Frampton, P. (2008). Gauge Field Theories (3rd ed.). Wiley-VCH.
• Kane, G.L. (1987). Modern Elementary Particle Physics. Perseus Books. ISBN 0-201-1 1749-5.
Articles:
• Becchi, C. (1997). Introduction to Gauge Theories ^\
• Gross, D. (1992). "Gauge theory - Past, Present and Future" [5] . Retrieved 2009-04-23.
• Jackson, J.D. (2002). "From Lorenz to Coulomb and other explicit gauge transformations" Am.J.Phys 70:
917-928. doi:10.1119/l. 1491265.
• Svetlichny, George (1999). Preparation for Gauge Theory \
External links
• Yang-Mills equations [8] on Dispersive Wiki
• Gauge theories [9] on Scholarpedia
References
[1] Wolfgang Pauli (1941) "Relativistic Field Theories of Elementary Particles" (http://prola.aps.org/abstract/RMP/v13/i3/p203_1), Rev.
Mod. Phys. 13: 203-32.
[2] Pickering, A. (1984). Constructing Quarks (http://www.amazon.com/Constructing-Quarks-Sociological-History-Particle/dp/0226667995/ref=pd_bbs_sr_1?ie=UTF8&s=books&qid=1235837296&sr=8-1). University of Chicago Press. ISBN 0226667995.
[3] Sakurai, Advanced Quantum Mechanics, sect 1-4
[4] http://arxiv.org/abs/hep-ph/9705211
[5] http://psroc.phys.ntu.edu.tw/cjp/v30/955.pdf
[6] http://arxiv.org/abs/physics/0204034
[7] http://arxiv.org/abs/math-ph/9902027
[8] http://tosio.math.toronto.edu/wiki/index.php/Yang-Mills_equations
[9] http://www.scholarpedia.org/article/Gauge_theories
List of quantum field theories
List of quantum field theories:
• Chern-Simons model
• Chiral model
• Gross-Neveu
• Kondo model
• Lower dimensional quantum field theory
• Minimal model
• Nambu-Jona-Lasinio
• Noncommutative quantum field theory
• Nonlinear sigma model
• Phi to the fourth
• Quantum chromodynamics
• Quantum electrodynamics
• Quantum flavordynamics
• Quantum Yang-Mills theory
• Schwinger model
• Sine-Gordon
• Standard model
• String Theory
• Thirring model
• Toda field theory
• Topological quantum field theory
• Wess-Zumino model
• Wess-Zumino-Witten model
• Yang-Mills
• Yang-Mills-Higgs model
• Yukawa model
Noncommutative geometry
Noncommutative geometry (NCG) is a branch of mathematics concerned with a geometric approach to
noncommutative algebras, and with the construction of spaces which are locally presented by noncommutative algebras
of functions (possibly in some generalized sense). A noncommutative algebra is here an associative algebra in which
the multiplication is not commutative, that is, for which xy does not always equal yx; or more generally an algebraic
structure in which one of the principal binary operations is not commutative. One also allows additional structures,
e.g. a topology or a norm, to be carried by the noncommutative algebra of functions. The leading direction in
noncommutative geometry has been set by the French mathematician Alain Connes, who has worked on it since about
1979.
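The simplest concrete instance of a noncommutative algebra is the algebra of 2×2 matrices. The sketch below is an illustration only: the two matrices and the helper name `matmul` are chosen here, not taken from the article. It multiplies two matrices in both orders and shows that the results differ.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    inner = len(b)
    return [
        [sum(a[i][t] * b[t][j] for t in range(inner)) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


# Two elementary 2x2 matrices (illustrative choice).
X = [[0, 1],
     [0, 0]]
Y = [[0, 0],
     [1, 0]]

XY = matmul(X, Y)
YX = matmul(Y, X)

# The commutator XY - YX measures the failure of commutativity.
COMMUTATOR = [[XY[i][j] - YX[i][j] for j in range(2)] for i in range(2)]

if __name__ == "__main__":
    print("XY =", XY)
    print("YX =", YX)
    print("XY - YX =", COMMUTATOR)
```

By contrast, functions on a space multiply pointwise, so their product never depends on the order of the factors.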
Motivation
The main motivation is to extend the commutative duality between spaces and functions to the noncommutative setting.
In mathematics, there is a close relationship between spaces, which are geometric in nature, and the numerical
functions on them. In general, such functions will form a commutative ring. For instance, one may take the ring C(X)
of continuous complex-valued functions on a topological space X. In many important cases (e.g., if X is a compact
Hausdorff space), we can recover X from C(X), and therefore it makes some sense to say that X has commutative
geometry.
More specifically, in topology, compact Hausdorff topological spaces can be reconstructed from the Banach algebra
of functions on the space (Gel'fand-Neimark). In commutative algebraic geometry, algebraic schemes are locally
prime spectra of commutative unital rings (A. Grothendieck), and schemes can be reconstructed from the categories
of quasicoherent sheaves of modules on them (P. Gabriel-A. Rosenberg). For Grothendieck topologies, the
cohomological properties of a site are invariants of the corresponding category of sheaves of sets viewed abstractly as
a topos (A. Grothendieck). In all these cases, a space is reconstructed from the algebra of functions or its categorified
version — some category of sheaves on that space.
Functions on a topological space can be multiplied and added pointwise hence they form a commutative algebra; in
fact these operations are local in the topology of the base space, hence the functions form a sheaf of commutative
rings over the base space.
The dream of noncommutative geometry is to generalize this duality to the duality between
• noncommutative algebras, or sheaves of noncommutative algebras, or sheaf-like noncommutative algebraic or
operator-algebraic structures
• and geometric entities of a certain kind,
and to pass between the algebraic and geometric descriptions of those via this duality.
Given that commutative rings correspond to usual affine schemes, and commutative C*-algebras to usual
topological spaces, the extension to noncommutative rings and algebras requires a non-trivial generalization of
topological spaces, as "non-commutative spaces". For this reason, some talk about non-commutative topology,
though the term also has other meanings.
Applications in mathematical physics
Some applications in particle physics are described in the entries Noncommutative standard model and
Noncommutative quantum field theory. The sudden rise of interest in noncommutative geometry in physics followed
speculations about its role in M-theory made in 1997 [1].
Motivation from ergodic theory
Some of the theory developed by Alain Connes to handle noncommutative geometry at a technical level has roots in
older attempts, in particular in ergodic theory. The proposal of George Mackey to create a virtual subgroup theory,
with respect to which ergodic group actions would become homogeneous spaces of an extended kind, has by now
been subsumed.
Non-commutative C*-algebras, von Neumann algebras
(The formal duals of) non-commutative C*-algebras are often now called non-commutative spaces. This is by
analogy with the Gelfand representation, which shows that commutative C*-algebras are dual to locally compact
Hausdorff spaces. In general, one can associate to any C*-algebra S a topological space $\hat{S}$; see spectrum of a
C*-algebra.
For the duality between $\sigma$-finite measure spaces and commutative von Neumann algebras, noncommutative von
Neumann algebras are called non-commutative measure spaces.
Non-commutative differentiable manifolds
A smooth Riemannian manifold M is a topological space with a lot of extra structure. From its algebra of continuous
functions C(M) we only recover M topologically. The algebraic invariant that recovers the Riemannian structure is a
spectral triple. It is constructed from a smooth vector bundle E over M, e.g. the exterior algebra bundle. The Hilbert
space $L^2(M,E)$ of square integrable sections of E carries a representation of C(M) by multiplication operators, and we
consider an unbounded operator D in $L^2(M,E)$ with compact resolvent (e.g. the signature operator), such that the
commutators [D,f] are bounded whenever f is smooth. A recent deep theorem states that M as a Riemannian
manifold can be recovered from this data.
This suggests that one might define a noncommutative Riemannian manifold as a spectral triple (A,H,D), consisting
of a representation of a C*-algebra A on a Hilbert space H, together with an unbounded operator D on H, with
compact resolvent, such that [D,a] is bounded for all a in some dense subalgebra of A. Research in spectral triples is
very active, and many examples of noncommutative manifolds have been constructed.
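The boundedness condition on commutators can be seen in the simplest commutative example, the circle. The sketch below rests on assumptions made here for illustration: H is $L^2$ of the circle with Fourier basis $e_n$, D is $-i\,d/d\theta$ (so that $D e_n = n e_n$), and the smooth function $e^{i\theta}$ acts as the shift U with $U e_n = e_{n+1}$. The basis is truncated to $|n| \le N$ so that the operators become finite matrices. The eigenvalues of D grow without bound as N increases, yet the commutator [D,U] equals U itself and so keeps norm 1.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][t] * b[t][j] for t in range(n)) for j in range(n)] for i in range(n)]


def circle_triple(cutoff):
    """Return (D, U) truncated to Fourier modes -cutoff..cutoff (illustrative helper)."""
    modes = list(range(-cutoff, cutoff + 1))
    size = len(modes)
    # D is diagonal with the mode numbers as eigenvalues.
    dirac = [[modes[i] if i == j else 0 for j in range(size)] for i in range(size)]
    # U sends basis vector j to basis vector j + 1 (dropped at the truncation edge).
    shift = [[1 if i == j + 1 else 0 for j in range(size)] for i in range(size)]
    return dirac, shift


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    for cutoff in (2, 5, 10):
        dirac, shift = circle_triple(cutoff)
        largest_eigenvalue = max(abs(dirac[i][i]) for i in range(len(dirac)))
        same = commutator(dirac, shift) == shift
        print(f"N = {cutoff:2d}: largest |eigenvalue of D| = {largest_eigenvalue}, [D,U] == U: {same}")
```

The identity [D,U] = U is the matrix form of the statement that differentiating $e^{i\theta}$ returns $i\,e^{i\theta}$: the commutator with D acts as a derivative, which is bounded on smooth functions.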
Non-commutative affine and projective schemes
In analogy to the duality between affine schemes and commutative rings, we define a category of noncommutative
affine schemes as the dual of the category of associative unital rings. There are certain analogues of Zariski topology
in that context so that one can glue such affine schemes to more general objects.
There are also generalizations of the Cone and of the Proj of a commutative graded ring, mimicking Serre's
theorem on Proj. Namely, the category of quasicoherent sheaves of O-modules on a Proj of a commutative graded
algebra is equivalent to the category of graded modules over the ring localized on Serre's subcategory of graded
modules of finite length; there is also an analogous theorem for coherent sheaves when the algebra is Noetherian. This
theorem is extended as a definition of noncommutative projective geometry by Michael Artin and J. J. Zhang [2],
who also add some general ring-theoretic conditions (e.g. Artin-Schelter regularity).
Many properties of projective schemes extend to this context. For example, there exists an analog of the celebrated
Serre duality for noncommutative projective schemes of Artin and Zhang [3].
A. L. Rosenberg has created a rather general relative concept of noncommutative quasicompact scheme (over a
base category), abstracting Grothendieck's study of morphisms of schemes and covers in terms of categories of
quasicoherent sheaves and flat localization functors [4]. There is also another interesting approach via localization
theory, due to Fred Van Oystaeyen, Luc Willaert and Alain Verschoren, where the main concept is that of a
schematic algebra [5].
Invariants for noncommutative spaces
Some of the motivating questions of the theory are concerned with extending known topological invariants to formal
duals of noncommutative (operator) algebras and other replacements and candidates for noncommutative spaces.
One of the main starting points of Alain Connes' direction in noncommutative geometry is his
discovery (made independently by Boris Tsygan) of a new homology theory associated to
noncommutative associative algebras and noncommutative operator algebras, namely cyclic homology, and of its
relations to algebraic K-theory (primarily via the Connes-Chern character map).
The theory of characteristic classes of smooth manifolds has been extended to spectral triples, employing the tools of
operator K-theory and cyclic cohomology. Several generalizations of now classical index theorems allow for
effective extraction of numerical invariants from spectral triples. The fundamental characteristic class in cyclic
cohomology, the JLO cocycle, generalizes the classical Chern character.
Examples of non-commutative spaces
• In Weyl quantization, the symplectic phase space of classical mechanics is deformed into a non-commutative
phase space generated by the position and momentum operators.
• The standard model of particle physics is another example of a noncommutative geometry, cf. noncommutative
standard model.
• The noncommutative torus, deformation of the function algebra of the ordinary torus, can be given the structure
of a spectral triple. This class of examples has been studied intensively and still functions as a test case for more
complicated situations.
• Snyder space [6]
• Noncommutative algebras arising from foliations.
• Examples related to dynamical systems arising from number theory, such as the Gauss shift on continued
fractions, give rise to noncommutative algebras that appear to have interesting noncommutative geometries.
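The noncommutative torus in the list above is generated by two unitaries U and V satisfying $UV = e^{2\pi i\theta} VU$ (sign conventions vary between authors). As a hedged illustration, when $\theta = 1/n$ is rational this relation has a finite-dimensional realization by the "clock" and "shift" matrices; the function names below are chosen for this sketch. For irrational $\theta$ no finite matrices satisfy the relation, so the sketch covers only the rational case.

```python
import cmath


def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][t] * b[t][j] for t in range(n)) for j in range(n)] for i in range(n)]


def clock_and_shift(n):
    """Return (omega, U, V) with U the clock matrix and V the cyclic shift."""
    omega = cmath.exp(2j * cmath.pi / n)
    clock = [[omega ** i if i == j else 0j for j in range(n)] for i in range(n)]
    shift = [[1 + 0j if i == (j + 1) % n else 0j for j in range(n)] for i in range(n)]
    return omega, clock, shift


def relation_defect(n):
    """Largest entry of UV - omega * VU; zero up to rounding if the relation holds."""
    omega, u, v = clock_and_shift(n)
    uv, vu = matmul(u, v), matmul(v, u)
    return max(abs(uv[i][j] - omega * vu[i][j]) for i in range(n) for j in range(n))


def commutator_size(n):
    """Largest entry of UV - VU; nonzero because U and V do not commute."""
    _, u, v = clock_and_shift(n)
    uv, vu = matmul(u, v), matmul(v, u)
    return max(abs(uv[i][j] - vu[i][j]) for i in range(n) for j in range(n))


if __name__ == "__main__":
    for n in (2, 3, 5):
        print(f"n = {n}: |UV - omega VU| = {relation_defect(n):.2e}, |UV - VU| = {commutator_size(n):.3f}")
```

Setting $\theta = 0$ makes U and V commute and recovers the function algebra of the ordinary torus, which is the sense in which the noncommutative torus is a deformation of it.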
Notes
[1] Alain Connes, Michael R. Douglas, Albert Schwarz, Noncommutative geometry and matrix theory: compactification on tori. J. High Energy
Phys. 1998, no. 2, Paper 3, 35 pp. doi (http://dx.doi.org/10.1088/1126-6708/1998/02/003), hep-th/9711162 (http://arxiv.org/abs/hep-th/9711162)
[2] M. Artin, J. J. Zhang, Noncommutative projective schemes, Adv. Math. 109 (1994), no. 2, 228-287, doi (http://dx.doi.org/10.1006/aima.1994.1087)
[3] Amnon Yekutieli, James J. Zhang, Serre duality for noncommutative projective schemes, Proc. Amer. Math. Soc. 125, n. 3, 1997, 697-707,
pdf (https://www.ams.org/proc/1997-125-03/S0002-9939-97-03782-9/S0002-9939-97-03782-9.pdf)
[4] A. L. Rosenberg, Noncommutative schemes, Compositio Math. 112 (1998) 93-125, doi (http://dx.doi.org/10.1023/A:1000479824211);
Underlying spaces of noncommutative schemes, preprint MPIM2003-111, dvi (http://www.mpim-bonn.mpg.de/preprints/send?bid=1947),
ps (http://www.mpim-bonn.mpg.de/preprints/send?bid=1948); MSRI lecture Noncommutative schemes and spaces (Feb 2000): video
(http://www.msri.org/publications/ln/msri/2000/interact/rosenberg/1/index.html)
[5] Freddy van Oystaeyen, Algebraic geometry for associative algebras, ISBN 0-8247-0424-X - New York: Dekker, 2000.- 287 p. - (Monographs
and textbooks in pure and applied mathematics , 232); F. van Oystaeyen, L. Willaert, Grothendieck topology, coherent sheaves and Serre's
theorem for schematic algebras, J. Pure Appl. Alg. 104 (1995), p. 109-122
[6] H. S. Snyder, Quantized Space-Time, Phys. Rev. 71 (1947) 38
References
• Connes, Alain (1994), Non-commutative geometry (http://www.alainconnes.org/docs/book94bigpdf.pdf),
Boston, MA: Academic Press, ISBN 978-0-12-185860-5
• Connes, Alain; Marcolli, Matilde (2008), "A walk in the noncommutative garden" (http://arxiv.org/abs/math/
0601054), An invitation to noncommutative geometry, World Sci. Publ., Hackensack, NJ, pp. 1-128, MR2408150
• Connes, Alain; Marcolli, Matilde (2008), Noncommutative geometry, quantum fields and motives (http://www.
alainconnes.org/docs/bookwebfinal.pdf), American Mathematical Society Colloquium Publications, 55,
Providence, R.I.: American Mathematical Society, MR2371808, ISBN 978-0-8218-4210-2
• Gracia-Bondía, José M.; Figueroa, Héctor; Várilly, Joseph C. (2000), Elements of Non-commutative geometry,
Birkhäuser, ISBN 978-0817641245
• Landi, Giovanni (1997), An introduction to noncommutative spaces and their geometries (http://arxiv.org/abs/hep-th/9701078),
Lecture Notes in Physics. New Series m: Monographs, 51, Berlin, New York: Springer-Verlag,
MR1482228, ISBN 978-3-540-63509-3
• Van Oystaeyen, Fred; Verschoren, Alain (1981), Non-commutative algebraic geometry, Lecture Notes in
Mathematics, 887, Springer-Verlag, ISBN 978-3540111535
External links
• Introduction to Quantum Geometry (http://www.matem.unam.mx/~micho/papers/qgeom.pdf) by Micho
Durdevich
• Lectures on Noncommutative Geometry (http://arxiv.org/abs/math/0506603) by Victor Ginzburg
• Very Basic Noncommutative Geometry (http://arxiv.org/abs/math/0408416) by Masoud Khalkhali
• Lectures on Arithmetic Noncommutative Geometry (http://arxiv.org/abs/math.qa/0409520) by Matilde
Marcolli
• Noncommutative Geometry for Pedestrians (http://arxiv.org/abs/gr-qc/9906059) by J. Madore
• An informal introduction to the ideas and concepts of noncommutative geometry (http://arxiv.org/abs/math-ph/0612012)
by Thierry Masson (an easier introduction that is still rather technical)
• Noncommutative geometry on arxiv.org (http://xstructure.inr.ac.ru/x-bin/subthemes3.py?level=2&index1=-173391&skip=0)
• MathOverflow, Theories of Noncommutative Geometry (http://mathoverflow.net/questions/10512/
theories-of-noncommutative-geometry)
• S. Mahanta, On some approaches towards non-commutative algebraic geometry, math.QA/0501166
(http://arxiv.org/abs/math/0501166)
Quantum gravity
Quantum gravity (QG) is the field of theoretical physics attempting to unify quantum mechanics with general
relativity in a self-consistent manner, or more precisely, to formulate a self-consistent theory which reduces to
ordinary quantum mechanics in the limit of weak gravity (potentials much less than $c^2$) and which reduces to
Einsteinian general relativity in the limit of large actions (action much larger than reduced Planck's constant). The
theory must be able to predict the outcome of situations where both quantum effects and strong-field gravity are
important (at the Planck scale, unless large extra dimension conjectures are correct). Motivation for quantizing
gravity comes from the remarkable success of the quantum theories of the other three fundamental interactions.
Although some quantum gravity theories such as string theory and other so-called theories of everything attempt to
unify gravity with the other fundamental forces, others such as loop quantum gravity make no such attempt; they
simply quantize the gravitational field while keeping it separate from the other forces.
Observed physical phenomena in the early 21st century can be described well by quantum mechanics or general
relativity, without needing both. This can be thought of as due to an extreme separation of mass scales at which they
are important. Quantum effects are usually important only for the "very small", that is, for objects no larger than
typical molecules. General relativistic effects, on the other hand, show up only for the "very large" bodies such as
collapsed stars. (Planets' gravitational fields, as of 2009, are well-described by linearized gravity; so strong-field
effects, meaning any effects of gravity beyond lowest nonvanishing order in $\phi/c^2$, have not been observed even in the
gravitational fields of planets and main sequence stars). There is a lack of experimental evidence relating to quantum
gravity, and classical physics adequately describes the observed effects of gravity over a range of 50 orders of
magnitude of mass, i.e. for masses of objects from about $10^{-23}$ to $10^{30}$ kg.
Overview
Much of the difficulty in meshing these theories at all energy scales comes from the different assumptions that these
theories make about how the universe works. Quantum field theory depends on particle fields embedded in the flat
space-time of special relativity. General relativity models gravity as a curvature within space-time that changes as a
gravitational mass moves. Historically, the most obvious way of combining the two (such as treating gravity as
simply another particle field) ran quickly into what is known as the renormalization problem. In the old-fashioned
understanding of renormalization, gravity particles would attract each other and adding together all of the
interactions results in many infinite values which cannot easily be cancelled out mathematically to yield sensible,
finite results. This is in contrast with quantum electrodynamics where, while the series still do not converge, the
interactions sometimes evaluate to infinite results, but those are few enough in number to be removable via
renormalization.
Effective field theories
Quantum gravity can be treated as an effective field theory. Effective quantum field theories come with some
high-energy cutoff, beyond which we do not expect that the theory provides a good description of nature. The
"infinities" then become large but finite quantities proportional to this finite cutoff scale, and correspond to processes
that involve very high energies near the fundamental cutoff. These quantities can then be absorbed into an infinite
collection of coupling constants and, at energies well below the fundamental cutoff of the theory, to any desired
precision, only a finite number of these coupling constants need to be measured in order to make legitimate
quantum-mechanical predictions. This same logic works just as well for the highly successful theory of low-energy
pions as for quantum gravity. Indeed, the first quantum-mechanical corrections to graviton scattering and Newton's
law of gravitation have been explicitly computed (although they are so astronomically small that we may never be
able to measure them). In fact, gravity is in many ways a much better quantum field theory than the Standard Model,
since it appears to be valid all the way up to its cutoff at the Planck scale. (By comparison, the Standard Model is
expected to start to break down above its cutoff at the much smaller scale of around 1000 GeV.)
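The scales in this argument can be made concrete with a short calculation. The sketch below is illustrative and rests on stated assumptions: the constants are rounded CODATA values, the Planck energy is taken as $\sqrt{\hbar c^5/G}$, and the relative size of the leading quantum correction to Newton's law at separation r is estimated by dimensional analysis as $(\ell_P/r)^2$ with $\ell_P = \sqrt{\hbar G/c^3}$, leaving out the numerical coefficient of order one. The function names are chosen here for illustration.

```python
import math

HBAR = 1.054571817e-34        # reduced Planck constant, J s
C = 299792458.0               # speed of light, m/s
G = 6.67430e-11               # Newton's constant, m^3 kg^-1 s^-2
JOULE_PER_GEV = 1.602176634e-10


def planck_energy_gev():
    """Planck energy sqrt(hbar c^5 / G), converted to GeV."""
    return math.sqrt(HBAR * C ** 5 / G) / JOULE_PER_GEV


def planck_length_m():
    """Planck length sqrt(hbar G / c^3), in metres."""
    return math.sqrt(HBAR * G / C ** 3)


def relative_quantum_correction(r_m):
    """Order-of-magnitude estimate (l_P / r)^2; the O(1) coefficient is omitted."""
    return (planck_length_m() / r_m) ** 2


if __name__ == "__main__":
    e_planck = planck_energy_gev()
    print(f"Planck energy           ~ {e_planck:.3e} GeV")
    print(f"Planck energy / 1000 GeV ~ {e_planck / 1000.0:.1e}")
    print(f"Planck length           ~ {planck_length_m():.3e} m")
    for r in (1.0, 1.0e-10, 1.0e-15):
        print(f"r = {r:.0e} m: relative correction ~ {relative_quantum_correction(r):.1e}")
```

Even at nuclear distances the estimated correction is tens of orders of magnitude below one, which is what "astronomically small" means in practice.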
While this confirms that quantum mechanics and gravity are indeed consistent at reasonable energies, it is clear that
near or above the fundamental cutoff of our effective quantum theory of gravity (the cutoff is generally assumed to
be of order the Planck scale), a new model of nature will be needed. Specifically, the problem of combining quantum
mechanics and gravity becomes an issue only at very high energies, and may well require a totally new kind of
model.
Quantum gravity theory for the highest energy scales
The general approach to deriving a quantum gravity theory that is valid at even the highest energy scales is to
assume that such a theory will be simple and elegant and, accordingly, to study symmetries and other clues offered
by current theories that might suggest ways to combine them into a comprehensive, unified theory. One problem
with this approach is that it is unknown whether quantum gravity will actually conform to a simple and elegant
theory, as it should resolve the dual conundrums of special relativity with regard to the uniformity of acceleration
and gravity, and general relativity with regard to spacetime curvature.
Such a theory is required in order to understand problems involving the combination of very high energy and very
small dimensions of space, such as the behavior of black holes, and the origin of the universe.
Quantum mechanics and general relativity
The graviton
At present, one of the deepest problems in theoretical physics is harmonizing
the theory of general relativity, which describes gravitation, and applies to
large-scale structures (stars, planets, galaxies), with quantum mechanics,
which describes the other three fundamental forces acting on the atomic scale.
This problem must be put in the proper context, however. In particular,
contrary to the popular claim that quantum mechanics and general relativity
are fundamentally incompatible, one can demonstrate that the structure of
general relativity essentially follows inevitably from the quantum mechanics
of interacting theoretical spin-2 massless particles (called
gravitons).
While there is no concrete proof of the existence of gravitons, quantized
theories of matter may necessitate their existence. Supporting this theory is
the observation that all other fundamental forces have one or more messenger
particles, except gravity, leading researchers to believe that at least one most
likely does exist; they have dubbed these hypothetical particles gravitons.
Many of the accepted notions of a unified theory of physics since the 1970s,
including string theory, superstring theory, M-theory and loop quantum gravity,
assume, and to some degree depend upon, the existence of the graviton.
Many researchers view the detection of the graviton as vital to validating their
work.
[Figure: Gravity Probe B (GP-B) has measured spacetime curvature near Earth to test related models in application of
Einstein's general theory of relativity.]
The dilaton
The dilaton made its first appearance in Kaluza-Klein theory, a five-dimensional theory that combined gravitation
and electromagnetism. Generally, it appears in string theory. More recently, it has appeared in the lower-dimensional
many-bodied gravity problem based on the field theoretic approach of Roman Jackiw. The impetus arose from the
fact that complete analytical solutions for the metric of a covariant N-body system have proven elusive in General
Relativity. To simplify the problem, the number of dimensions was lowered to (1+1), namely one spatial dimension
and one temporal dimension. This model problem, known as R=T theory (as opposed to the general G=T theory),
was amenable to exact solutions in terms of a generalization of the Lambert W function. It was also found that the
field equation governing the dilaton (derived from differential geometry) was none other than the Schrödinger
equation and consequently amenable to quantization. Thus, one had a theory which combined gravity, quantization
and even the electromagnetic interaction, promising ingredients of a fundamental physical theory. It is worth noting
that the outcome revealed a previously unknown and already existing natural link between general relativity and
quantum mechanics. However, this theory needs to be generalized in (2+1) or (3+1) dimensions although, in
principle, the field equations are amenable to such generalization. It is not yet clear what field equation will govern
the dilaton in higher dimensions. This is further complicated by the fact that gravitons can propagate in (3+1)
dimensions and consequently that would imply gravitons and dilatons exist in the real world. Moreover, detection of
the dilaton is expected to be even more elusive than the graviton. However, since this approach allows for the
combination of gravitational, electromagnetic and quantum effects, their coupling could potentially lead to a means
of vindicating the theory, through cosmology and perhaps even experimentally.
Nonrenormalizability of gravity
General relativity, like electromagnetism, is a classical field theory. One might expect that, as with
electromagnetism, there should be a corresponding quantum field theory.
However, gravity is nonrenormalizable. [10] For a quantum field theory to be well-defined according to this
understanding of the subject, it must be asymptotically free or asymptotically safe. The theory must be characterized
by a choice of finitely many parameters, which could, in principle, be set by experiment. For example, in quantum
electrodynamics, these parameters are the charge and mass of the electron, as measured at a particular energy scale.
On the other hand, in quantizing gravity, there are infinitely many independent parameters needed to define the
theory. For a given choice of those parameters, one could make sense of the theory, but since we can never do
infinitely many experiments to fix the values of every parameter, we do not have a meaningful physical theory:
• At low energies, the logic of the renormalization group tells us that, despite the unknown choices of these
infinitely many parameters, quantum gravity